Preface


This book is designed to give a clear introduction to particle physics at 
a level that will be accessible to advanced undergraduate students. Most 
of the book concentrates on the ‘Standard Model of Particle Physics’— 
what it is and the experimental evidence that supports it. The book 
fills a gap between the more qualitative introductory texts and the more 
mathematically advanced graduate textbooks. Our aim is to teach the 
maximum amount of physics, with the minimum level of maths. This 
book is in the spirit of Perkins’ classic textbook but updated to the 
LHC era. Particle physics is an experimental science; accelerators and 
detectors have been essential for progress in this field. The unique feature 
of this book is that it gives a serious explanation of the practical side of 
the subject at an accessible level for undergraduates. This will provide 
students with some real understanding of these subjects and equip them 
to appreciate the many excellent graduate-level textbooks in these fields 
and to follow published papers. The core of the book covers the theory 
and experiments that underpin the Standard Model. Neutrino oscilla- 
tions and flavour oscillations of the neutral strange, charm, and beauty 
states are explained carefully, including the violation of CP symmetry. 
The book covers the discovery of the Higgs at the LHC, explaining the 
critical issues and how one can extract such a small signal from a large 
background. We discuss the problems with the Standard Model that 
give a very strong indication that there should be physics at the TeV 
scale, Beyond the Standard Model (BSM). We summarize some possible 
BSM theories that could solve these problems and give examples of LHC 
searches for BSM physics. We review the evidence for dark matter and 
consider how LHC and other experiments are searching for it, and we 
look at the evidence for dark energy and its theoretical consequences. 

Each chapter has questions to help students deepen their understand- 
ing of the subject, some of which are based on those used in teaching 
this subject at Oxford as a 4th-year physics major option course. 

The official OUP website for this book is
http://ukcatalogue.oup.com/product/9780198748557.do. This contains a
link to a website ppLHCEra.physics.ox.ac.uk maintained by the authors.
We are maintaining a list of
errata on this website and we would appreciate receiving corrections via 
the link on the OUP website. Many new results will be appearing over 
the next few years. We provide links to Particle Data Group reviews 
and to websites for some of the current experiments. Finally, suggested 
solutions are available to course instructors: a request form is available 
on the OUP book website. 
Introduction


The aim of this book is to provide a practical introduction to particle 
physics in the LHC era at the level of an advanced undergraduate or 
introductory graduate course. It fills a gap between qualitative intro- 
ductory texts and advanced texts based on relativistic quantum field 
theory. We give a clear and concise explanation of key theoretical con- 
cepts and their grounding in experimental measurements, with as little 
use of advanced mathematical techniques as possible. The exceptions 
are a fairly detailed coverage of exact and broken symmetries and gauge 
invariance. The language and techniques of relativistic quantum field 
theory are not used, but relativistic quantum mechanics is covered,
focusing on the Klein-Gordon and Dirac equations. The book focuses on
the physics of colliders, particularly those delivering the highest
energies: proton-proton (LHC), electron-positron (LEP and ILC), and
electron-proton (HERA). Experiments and results from older and/or
lower-energy electron-positron colliders are discussed when necessary,
for example the so-called B-factories. Fixed-target experiments,
particularly those using neutrino beams and those studying neutral meson
oscillation phenomena ($K^0$, $B^0$), are outlined. Finally,
non-accelerator-based particle physics topics, such as the observation
and measurement of solar neutrinos and the experimental search for dark
matter, are described briefly.

This chapter introduces the fundamental particles and the forces with 
which they interact. We will use ‘natural units’ and these are defined in 
the next section. Accelerators and colliders are introduced in a histor- 
ical context that makes clear how much of our current understanding of 
fundamental particles and forces relies on the steady increase in inter- 
action energy made possible by advances in accelerator technology. An 
important feature of this book is a description of how modern electronic
detectors work, particularly the large detectors built to study
proton-proton collisions at the multi-TeV scale of the LHC. Coupled with
this is the enormous increase in computing power over the last
half-century and the organization of this power on a global scale using
the worldwide web,1 which enables scattering events to be selected and
reconstructed in almost real time.

The chapter ends with a brief resume of the rest of the book. 
1This is known as GRID computing. A
local example used by the authors is 
2It is said that the name ‘barn’
originated when early measurements of 
nuclear scattering cross sections were 
larger than expected: ‘as big as barn 
doors’. 


3Compare gravitational and electrostatic
forces between two protons at a
separation of $10^{-15}$ m.


1.1 Units 


Studying matter at subnuclear scales requires interactions at very high
energies. SI units are very useful in many contexts, but for this subject
they would require us to keep track of quantities, particularly energies,
to large negative powers of ten. Particle physics uses the natural units
of MeV or GeV (= 1000 MeV) for energy, the femtometre (also known as
the fermi, 1 fm = $10^{-15}$ m) for distance, and the barn
(1 b = $10^{-28}$ m$^2$) for area (used for cross sections).2

By using natural units, we can set $\hbar = c = 1$ in our calculations.
So, for example, the familiar relation between energy and momentum in
special relativity becomes simply $E^2 = p^2 + m^2$. In natural units,
mass, energy, and momentum have the same units, which simplifies
dimensional-analysis checks. At the end of a calculation, we might need
to convert the answer to practical units, which we can do very simply by
using the conversion factors $\hbar c = 197.3$ MeV fm and
$(\hbar c)^2 = 0.3894$ GeV$^2$ mb.
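
As a minimal sketch of such a conversion, the two factors quoted above can
be wrapped in small helper functions. The function names are ours, and the
pion mass used in the example is an assumed input chosen only for
illustration.

```python
# Conversion factors quoted in the text (natural units, hbar = c = 1).
HBARC_MEV_FM = 197.3        # hbar*c in MeV fm
HBARC2_GEV2_MB = 0.3894     # (hbar*c)^2 in GeV^2 mb


def length_in_fm(length_per_mev: float) -> float:
    """Convert a length expressed in MeV^-1 to femtometres."""
    return length_per_mev * HBARC_MEV_FM


def cross_section_in_mb(sigma_per_gev2: float) -> float:
    """Convert a cross section expressed in GeV^-2 to millibarns."""
    return sigma_per_gev2 * HBARC2_GEV2_MB


if __name__ == "__main__":
    # Hypothetical inputs, used only to illustrate the arithmetic.
    pion_mass_mev = 139.6                      # assumed charged-pion mass
    print(length_in_fm(1.0 / pion_mass_mev))   # 1/m_pi as a length, about 1.41 fm
    print(cross_section_in_mb(1.0))            # 1 GeV^-2 expressed in mb
```

A quantity with dimensions of inverse energy is a length, and one with
dimensions of inverse energy squared is an area, so a single multiplication
by the appropriate power of $\hbar c$ restores practical units.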

We will always use natural units in this book, unless explicitly 
indicated otherwise. 


1.2 Early days 


Particle physics had its roots in nuclear physics and cosmic-ray physics. 
Its remit is the study of the fundamental building blocks of matter and 
the interactions between them, particularly the strong, electromagnetic, 
and weak interactions. The gravitational interaction does not play a 
significant role in most of the topics covered in this book.3
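
The comparison suggested in footnote 3 can be made with a few lines of
arithmetic. This is only a sketch: the SI constants are rounded values that
we assume here, and the function name is hypothetical.

```python
import math

# Rounded SI constants; treat them as assumptions of this sketch.
G_NEWTON = 6.674e-11        # m^3 kg^-1 s^-2
PROTON_MASS = 1.6726e-27    # kg
E_CHARGE = 1.6022e-19       # C
EPSILON_0 = 8.8542e-12      # F m^-1


def force_ratio() -> float:
    """Gravitational / electrostatic force between two protons.

    Both forces fall as 1/r^2, so the separation cancels in the ratio.
    """
    gravity = G_NEWTON * PROTON_MASS**2
    coulomb = E_CHARGE**2 / (4.0 * math.pi * EPSILON_0)
    return gravity / coulomb


if __name__ == "__main__":
    print(f"F_gravity / F_coulomb = {force_ratio():.2e}")
```

The ratio comes out at roughly $10^{-36}$, independent of the separation,
which is why gravity can be ignored at the energies considered here.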

In the 1930s, it appeared that atoms and nuclei could be understood in 
terms of a rather small number of constituents: the proton and neutron; 
the electron and positron; the neutrino and antineutrino.
Electromagnetic interactions were assumed to remain described by the
theories of Faraday and Maxwell at subatomic distance scales, but
consistent relativistic calculations required the development of quantum
field theory and, in particular, an understanding of renormalization (see
Section 1.2.1). Weak interactions were more of a problem. The contact
interaction developed by Fermi was very successful in bringing order to a
wide range of phenomena, but it was not renormalizable and hence did not
allow reliable higher-order calculations. The strong nuclear force has a
range of a few fermi ($10^{-15}$ m), the scale of a nucleus. The strength
of the electromagnetic interaction is given by the fine-structure
constant, $\alpha = e^2/4\pi\hbar c$, with a value $\approx 1/137$.
Defining a ‘strong nuclear charge’ $\alpha_s$ similarly gives
$\alpha_s \sim 1$ at a distance of a few fermi. The scale of weak
interactions is given by the Fermi constant
$G_F/(\hbar c)^3 = 1.166 \times 10^{-5}$ GeV$^{-2}$.

This simple picture could not accommodate the discovery of new par- 
ticles by cosmic-ray physicists: the muon by Anderson and Neddermeyer 
in 1936 using a cloud chamber within a magnetic field; the pion in 
1947 by Powell’s group in Bristol using specially developed photographic 
emulsion; and ‘strange particles’ with V-shaped or kinked tracks by
Rochester and Butler (1946-47) using a coincidence-counter-controlled
cloud chamber. What was strange about the new particles was a much 
longer lifetime than would be expected for a ‘normal’ strongly interacting 
particle with a comparable mass. A new quantum number, strangeness, 
was introduced independently by Gell-Mann and by Nishijima in 1953 to 
explain this. Their proposal was that strangeness was conserved in strong 
and electromagnetic interactions but not in weak interactions. The light- 
est strange particles could only decay by a strangeness-changing weak 
interaction. 

This was just the beginning. In the early 1950s, the development in the 
USA and Europe of synchrotrons capable of delivering particle beams 
with GeV energies and of the bubble chamber led to a plethora of new 
short-lived particles. 


1.2.1 Particles and forces 


The forces that we are most familiar with at a macroscopic level are 
gravitation and electromagnetism, and at this level a particle may be 
defined roughly as a ‘point-like’ object that has a well-defined mass and 
charge. This also works reasonably well at the scale of atoms ($10^{-10}$ m), at
which electrons, protons, and neutrons can be considered point-like. In 
quantum mechanics and quantum field theory, a force is described by the 
exchange of a field quantum: in the case of electromagnetism, the
photon ($\gamma$). One of the consequences of Dirac’s attempts to find a
physical explanation for the troublesome negative-energy solutions of his
otherwise very successful equation describing the electron was the
prediction of antiparticles (1930-31), in particular the positron. An
antiparticle is a particle with the same mass and spin as a particle, but
with opposite electric charge. A neutral particle can be its own
antiparticle, for example the photon. Feynman4 has given a simple but
elegant argument that antiparticles are a necessary outcome of a
relativistically invariant description of particle interactions. Shortly
after Dirac’s prediction, the positron was discovered by Anderson in
1932. By the late 1940s, the development of renormalized quantum field
theories5 gave a consistent way to handle the infinities that seemed
inherent in any quantum description of particle creation and
annihilation. The paradigm is quantum electrodynamics (QED), for which
the mass and electric charge of the electron are two of the parameters
that are renormalized, the so-called vacuum energy6 being the third.
After renormalization, QED provides a theory that has enabled amazingly
precise calculations of quantities such as the magnetic moment of the
electron.7

At a deep level, much of the thrust of particle physics in the twentieth 
century was to find out if the strong and weak nuclear forces could 
be described by renormalized quantum fields, and, if so, to discover the 
related field quanta. We now know that this is the case, with the
$W^\pm$ and $Z^0$ providing the weak force and the eight massless gluons
($g$) the strong
force. To describe these interactions, more complicated field theories are 
required, but they have been shown to be renormalizable. 
4‘The reason for antiparticles’, in the
1986 Dirac Memorial lectures; see the
further reading at the end of this
chapter.


5A renormalizable theory is one in 
which infinities to all orders of per- 
turbation theory can be absorbed by 
the redefinition of a finite number of 
the parameters of the theory, such as 
masses and coupling constants. These 
parameters are then fixed from experi- 
mental observation. 


6A consequence of the quantum time-energy
uncertainty relation, vacuum
energy $\Delta E$ can exist for a time $\Delta t$,
provided $\Delta E \Delta t < \hbar$.


7 Quantum field theory is beyond the
remit of this book, but some intro- 
ductory texts are listed in the further 
reading at the end of Chapter 6. 
8A more detailed account of the group
theory that we need is given at the end
of Chapter 2.


9The subscript ‘flavour’ is to distinguish
this symmetry from the exact
SU(3)$_{\text{colour}}$ of QCD. The subscripts
may be omitted if the context is
unambiguous.


10 Quarks, antiquarks, and gluons.


11 This is covered in the discussion of
gauge symmetry in Chapter 6.


12 For an ultrarelativistic particle, ‘left- 
handed’ refers to the component of its 
spin projected along a direction oppos- 
ite to that of its 3-momentum. 


13The subscript L is for ‘left-handed’, 
but also indicates that this is not the 
approximate SU(2) of nuclear isospin 
mentioned above. 


1.2.2 Group theory in particle physics 


Group theory is the mathematical description of patterns, both of phys- 
ical structures such as crystals and more abstractly of groups of related 
objects, for example particles with similar properties such as the pions 
($\pi^\pm$, $\pi^0$). Group theory also plays an essential role in the
mathematical structure describing the forces and interactions of
fundamental particles.8 The great benefit of group theory is that it
provides much of the
mathematical apparatus needed to exploit the underlying symmetries of 
the fundamental forces. 

As we shall explain in detail in Chapter 5, the mesons and baryons, 
composed of the u, d, and s quarks, occur in patterns of a symmetry
known as9 SU(3)$_{\text{flavour}}$. This symmetry is approximate because of the
difference in masses of the three quarks. It contains an SU(2) subgroup 
known as isospin composed of hadrons with only u and d quarks. The 
pion states just mentioned form an isospin triplet and the neutron and 
proton an isospin doublet. 

The strong interaction among the constituents10 of hadrons is based
on an exact SU(3) group structure with three ‘colour’ charges. This is the 
theory of quantum chromodynamics (QCD), which underpins the phys- 
ics described in Chapter 9. The space-time structure of QCD is similar 
to that of QED, with an inverse-square-law force. Like the photon, the 
gluons are massless particles, but, unlike the photon, the gluons carry 
a colour charge—there are eight gluons. Further explanation is given in 
Chapter 9. 


Group theory in electroweak unification 


The most complicated use of group theory is in describing the ‘uni- 
fication’ of the weak and electromagnetic interactions to form the 
electroweak theory of the Standard Model. It is complicated because 
QED is a spatial parity-conserving force whereas the weak force does 
not respect this symmetry. Electrodynamics has another important
feature: it is ‘gauge-invariant’. Maxwell’s equations are unchanged by
a change in the electromagnetic 4-vector potential
$A_\mu \to A_\mu - \partial_\mu \Lambda$, where $\Lambda$ is a scalar
function. Under a gauge transformation, a wavefunction changes by a
phase, $\psi \to \psi e^{ie\Lambda}$. In the language of group theory,
this is a U(1) symmetry. This is much more than just a mathematical
curiosity, since the replacement of the 4-momentum of a charged particle,
$p_\mu \to p_\mu - eA_\mu$, in the (classical) equations of motion
generates the correct form for the electromagnetic interaction.11
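
The invariance of Maxwell's equations under this change of potential can be
checked in one line. The sketch below assumes the standard definition of the
field-strength tensor $F_{\mu\nu}$, which has not been introduced in the text
so far; Maxwell's equations involve the potential only through this
combination.

```latex
\begin{align}
F_{\mu\nu} &\equiv \partial_\mu A_\nu - \partial_\nu A_\mu , \\
A_\mu \to A_\mu - \partial_\mu \Lambda
\;\Rightarrow\;
F_{\mu\nu} &\to \partial_\mu (A_\nu - \partial_\nu \Lambda)
             - \partial_\nu (A_\mu - \partial_\mu \Lambda) \\
           &= F_{\mu\nu}
             - (\partial_\mu \partial_\nu - \partial_\nu \partial_\mu)\Lambda
            = F_{\mu\nu} .
\end{align}
```

The extra terms cancel because partial derivatives commute, so the physical
fields are the same for any choice of the scalar function $\Lambda$.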

The quanta of the weak force are the charged $W^\pm$ spin-1 bosons,
interacting via their ‘left-handed’12 states, and the $Z^0$, which
couples to both left- and right-handed states but with different
strengths. As both the photon and $Z^0$ are spin-1 states with zero
electric charge, they can interfere, with a strength given by a mixing
angle $\theta_W$ (the ‘weak mixing angle’ or ‘Weinberg angle’). The group
structure of the left-handed states is that of13 SU(2)$_L$. Using
group-theoretical language, this synthesis of electrodynamics with the
weak interaction is based on a U(1)$\otimes$SU(2)$_L$ group structure. In
an analogous, but more complicated, procedure to that described above for
QED, the full electroweak interaction can be generated by a suitable
gauge transformation. Electroweak unification is covered in Section 7.4.


1.2.3 Particles 


Particles, including the force quanta, are classified according to their
spin and interactions. Leptons are spin-$\tfrac12$ fermions that do not
interact via the strong force: the electron ($e$) and associated neutrino
($\nu_e$) provide the paradigm. Two further sets or generations have been
discovered: the muon ($\mu$) and muon-neutrino ($\nu_\mu$) and the tau
($\tau$) and tau-neutrino ($\nu_\tau$). To account for the
non-observation of decay modes such as $\mu \to e\gamma$ and
$\tau \to \mu\gamma$, each generation of lepton pairs is given a lepton
number, $L_\ell$. For each lepton, there is a corresponding antiparticle
with opposite signs for charge $Q$ and lepton number. The properties of
the leptons are summarized in Table 1.1.

As will be explained in Chapter 8, the number of neutrino species,
$N_\nu$, is given by the width of the $Z^0$ vector boson and is
consistent with a value of 3.

Strongly interacting particles (hadrons) are composed of quarks and
antiquarks bound tightly in $q\bar q$ (meson) or $qqq$ (baryon)
combinations by the colour field of QCD. Free quarks have never been
observed directly, although there is evidence that they may become
unbound in a quark-gluon plasma, which is being studied using heavy-ion
collisions. Quarks carry fractional electric charge, $+\tfrac23 e$ or
$-\tfrac13 e$, where $e$ is the charge of the positron. There are six
quarks, grouped in charge ($-\tfrac13$, $+\tfrac23$) pairs: (d, u),
(s, c), (b, t). The (d, u) pair form a strong isospin doublet. The
details are given in Table 1.2. All quarks have $J^P = \tfrac12^+$. It
is worth noting that the three pairs of quarks (d, u), (s, c), (b, t) of
increasing mass are matched by the three lepton pairs ($e$, $\nu_e$),
($\mu$, $\nu_\mu$), ($\tau$, $\nu_\tau$).


State        Q    Mass                    $L_e$  $L_\mu$  $L_\tau$  Lifetime

$e^-$        -1   0.511 MeV               +1     0        0         > $4.6 \times 10^{26}$ years
$\nu_e$      0    < 2 eV                  +1     0        0         Stable

$\mu^-$      -1   105.7 MeV               0      +1       0         $2.197034(21) \times 10^{-6}$ s
$\nu_\mu$    0    < 0.19 MeV              0      +1       0         Stable

$\tau^-$     -1   $1776.82 \pm 0.16$ MeV  0      0        +1        $(290.6 \pm 1.0) \times 10^{-15}$ s
$\nu_\tau$   0    < 18.2 MeV              0      0        +1        Stable


Table 1.1 Lepton properties.
Quark  Q      Mass                          $I$   $I_z$

d      -1/3   4.1-5.8 MeV                   1/2   -1/2
u      +2/3   1.7-3.3 MeV                   1/2   +1/2
s      -1/3   80-130 MeV                    0     0
c      +2/3   1.18-1.34 GeV                 0     0
b      -1/3   $4.19^{+0.18}_{-0.06}$ GeV    0     0
t      +2/3   $172.0 \pm 1.6$ GeV           0     0


Table 1.2 Quark properties.
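
The charge assignments in Table 1.2 can be checked against familiar hadrons
with a short sketch. The quark contents used below (uud for the proton, udd
for the neutron, u with anti-d for the positive pion) are standard but are
not listed in the tables above, and the `~` suffix marking an antiquark is
our own notation.

```python
from fractions import Fraction

# Quark charges in units of e, as listed in Table 1.2.
QUARK_CHARGE = {
    "d": Fraction(-1, 3), "u": Fraction(2, 3),
    "s": Fraction(-1, 3), "c": Fraction(2, 3),
    "b": Fraction(-1, 3), "t": Fraction(2, 3),
}


def hadron_charge(content):
    """Sum the quark charges; a trailing '~' marks an antiquark (sign flipped)."""
    total = Fraction(0)
    for label in content:
        if label.endswith("~"):
            total -= QUARK_CHARGE[label[:-1]]
        else:
            total += QUARK_CHARGE[label]
    return total


if __name__ == "__main__":
    for name, content in [("proton", ["u", "u", "d"]),
                          ("neutron", ["u", "d", "d"]),
                          ("pi+", ["u", "d~"])]:
        print(name, hadron_charge(content))
```

Exact fractions are used rather than floating-point numbers so that the
integer hadron charges come out without rounding.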


1.2.4 Forces 


The force carriers of the Standard Model occur in two independent 
sectors: 


• the electroweak (EW) sector, with the photon, $W^\pm$, and $Z^0$ linked
by the U(1)$\otimes$SU(2) symmetry;

• the strong quantum chromodynamic (QCD) sector, with an SU(3)
colour octet of massless gluons.


All force carriers in the Standard Model are spin-1 bosons and their 
properties are summarized in Table 1.3. 


Sector   Boson      Q    Colour charge                   Mass              Width            $J^P$

EW       $W^\pm$    ±1   0                               80.399(23) GeV    2.085(42) GeV    1
         $Z^0$      0    0                               91.1876(21) GeV   2.4952(23) GeV   1
         $\gamma$   0    0                               0                 0 (stable)       $1^-$
Strong   $g$        0    SU(3)$_{\text{colour}}$ octet   0                 0 (stable)       $1^-$


Table 1.3 Force carriers.
For the symmetries to be exact, the particles are assumed to be ini-
tially massless, with the Higgs mechanism being invoked to generate 
particle masses (for all except the photon and gluons) while preserving 
the underlying symmetry structure. This mechanism requires the exist- 
ence of a Higgs particle and there is now strong evidence from the Large 
Hadron Collider at CERN for the existence of at least one Higgs boson. 
The details of this key discovery for completing the Standard Model are 
covered in Chapter 12. After a long shutdown between 2013 and 2015, 
the LHC will operate at the higher centre-of-mass energy of 13 TeV and 
with higher luminosity. Apart from studying the Higgs in greater detail, 
much effort will be devoted to the search for evidence of physics beyond 
the Standard Model. 

The 3-fold colour quantum number was introduced to allow baryon
wavefunctions, for example that of the $\Delta^{++}$ composed of three u
quarks (spin $\tfrac32$, isospin $\tfrac32$), to have simultaneously the
correct permutational symmetry and satisfy Fermi-Dirac statistics. QCD
provides the theoretical basis for why only the ‘colourless’ $qqq$ and
$q\bar q$ combinations form ‘confined’ hadronic bound states. A major
difference between QCD and QED is that the force carriers are ‘charged’,
in that the gluons carry a colour charge. Consider a $q\bar q$ meson and
all the possible colour combinations that a gluon exchanged between the
$q$ and $\bar q$ might carry. With $r$, $b$, $g$ denoting the
SU(3)$_{\text{colour}}$ charges, from
$(r, b, g) \otimes (\bar r, \bar b, \bar g)$, one might expect nine
coloured gluons. However, the three colour-neutral combinations
$(r\bar r, b\bar b, g\bar g)$ have one totally symmetric combination,
$\tfrac{1}{\sqrt3}(r\bar r + b\bar b + g\bar g)$. In group-theoretical
language, this corresponds to combining the $3$ and $\bar 3$
representations of SU(3)$_{\text{colour}}$:
$3 \otimes \bar 3 = 8 \oplus 1$. The totally symmetric combination
corresponds to the ‘1’ and would be colourless and unconfined, so it is
discarded. The remaining octet of coloured gluon states are


$r\bar b,\ r\bar g,\ b\bar g,\ b\bar r,\ g\bar r,\ g\bar b,\ \tfrac{1}{\sqrt2}(r\bar r - b\bar b),\ \tfrac{1}{\sqrt6}(r\bar r + b\bar b - 2g\bar g).$


Note that there are two apparently colourless states. However, these 
two colour states are analogous to the electrically neutral members of a 
strong isospin multiplet (e.g. a p°)—they are not colourless. 
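
A quick numerical check of this counting is sketched below. It assumes the
normalization factors $1/\sqrt2$, $1/\sqrt6$, and $1/\sqrt3$ for the
diagonal states, represents each colour-anticolour state by a 3 × 3 table of
real coefficients, and uses helper names of our own choosing.

```python
import math

COLOURS = ("r", "b", "g")


def state(terms):
    """Build a 3x3 coefficient table from {(colour, anticolour): coefficient}."""
    table = [[0.0] * 3 for _ in range(3)]
    for (colour, anticolour), coeff in terms.items():
        table[COLOURS.index(colour)][COLOURS.index(anticolour)] = coeff
    return table


def inner(a, b):
    """Inner product of two colour-anticolour states with real coefficients."""
    return sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))


# The colourless singlet: (r rbar + b bbar + g gbar) / sqrt(3).
singlet = state({(c, c): 1 / math.sqrt(3) for c in COLOURS})

# Six off-diagonal states plus the two diagonal, traceless combinations.
octet = [state({(c, cbar): 1.0}) for c in COLOURS for cbar in COLOURS if c != cbar]
octet.append(state({("r", "r"): 1 / math.sqrt(2), ("b", "b"): -1 / math.sqrt(2)}))
octet.append(state({("r", "r"): 1 / math.sqrt(6), ("b", "b"): 1 / math.sqrt(6),
                    ("g", "g"): -2 / math.sqrt(6)}))

if __name__ == "__main__":
    print(len(octet), "octet states")
    print(max(abs(inner(s, singlet)) for s in octet), "largest overlap with singlet")
```

All eight states are orthogonal to the singlet and to each other, so together
with the singlet they span the nine colour-anticolour combinations.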


1.3 Diagrams


Two sorts of diagrams are used in this book: Feynman diagrams and
‘quark-flow’ diagrams. Richard Feynman invented a very elegant
graphical formalism that provides a considerable shortcut in
calculations. Feynman rules make a direct connection between each element
of a diagram and a component of the mathematical expression describing
the process, derived from quantum field theory. An example of a Feynman
diagram for the process $e^+ d \to \bar\nu_e u$ via $W^+$ exchange is
shown in Fig. 1.1. Feynman’s graphical technique was invented in the
1940s during the heroic age of relativistic quantum field theory
calculations of electromagnetic interactions. More details and an outline
of the ‘Feynman rules’ on how to construct a diagram are given in
Section 7.2.2.


Fig. 1.1 Feynman diagram for $e^+ d \to \bar\nu_e u$ via $W^+$ exchange.


Fig. 1.2 Lepton pair production by the Drell-Yan process
$\pi^- p \to n\,\ell^+\ell^-$.


Fig. 1.3 Accelerators and colliders in use between 1950 and 2020:
hadron-hadron (diamonds), electron-positron (boxes), and HERA
electron-proton (triangle).

An example of a quark flow diagram is shown in Fig. 1.2. It shows the 
so-called Drell-Yan process $\pi^- p \to n\,\ell^+\ell^-$. Quark flow diagrams are not
an exact calculational tool. However, they are very useful for explaining 
and understanding what is happening in a particle interaction. They 
also enable one to keep track of charges and other quantum numbers 
such as strangeness that may be changing but have to satisfy overall 
conservation laws. 


1.4 Accelerators, colliders, and detectors 


This section covers the essential ‘tools of the trade’ for high-energy 
particle physics. 

Accelerators were first developed for high-energy nuclear physics in 
the 1930s: both ‘linear’ electrostatic devices and the first circular accel- 
erators. After the Second World War, new technology enabled a huge 
increase in beam energies. A summary of accelerators and colliders in 
operation over the period 1950-2010 is shown in Fig. 1.3. 

Detector technology has also changed a lot owing to the development 
of microprocessors and advanced circuit board design—some of it driven 
by the demands of the video-gaming industry. 


1.4.1 Accelerators 


The earliest particle accelerators were based on the use of a single very 
large potential difference to accelerate a charged particle: Van de Graaff 
(1929) used a dielectric belt to transfer charge from a voltage source to
a large spherical isolated upper terminal; Cockcroft and Walton (1932)
used a series of stages to ‘multiply’ the voltage. The maximum energy
was limited by electrical breakdown, typical maximum accelerating
voltages being around 25 MV. Both technologies are still in use: Van
de Graaff machines for research in nuclear physics and the Cockcroft- 
Walton multiplier as an early accelerating stage after the ion source in 
high-energy facilities. 

It was soon realized that to get to ever higher energies, a circular 
device would allow the accelerating element to be used more than once. 
Ernest Lawrence pioneered the early development of circular
accelerators in the 1930s. His machine, the cyclotron, had a circular
beam with a radius that increased as it was accelerated, and the whole
device was enclosed in a single large electromagnet. The largest
cyclotron that Lawrence built had a diameter of 1.5 m and produced an
8 MeV proton beam.
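
The numbers quoted for this cyclotron give a feel for the magnetic field
involved. The sketch below assumes the standard relation
p [GeV/c] ≈ 0.2998 B [T] r [m] for a unit-charge particle on a circular
orbit, takes the final orbit radius to be half the quoted diameter, and uses
an assumed proton mass; the function names are ours.

```python
import math

PROTON_MASS_GEV = 0.938272   # proton mass in GeV (assumed value, not from the text)


def momentum_gev(kinetic_gev: float, mass_gev: float) -> float:
    """Relativistic momentum from kinetic energy: p^2 = T^2 + 2 m T."""
    return math.sqrt(kinetic_gev**2 + 2.0 * mass_gev * kinetic_gev)


def bending_field_tesla(p_gev: float, radius_m: float) -> float:
    """Field that holds a unit-charge particle on a circle: p = 0.2998 B r."""
    return p_gev / (0.2998 * radius_m)


if __name__ == "__main__":
    p = momentum_gev(0.008, PROTON_MASS_GEV)      # 8 MeV protons
    print(f"p = {p * 1000:.1f} MeV")
    print(f"B = {bending_field_tesla(p, 0.75):.2f} T at r = 0.75 m")
```

With these assumptions the field comes out at roughly half a tesla.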

In a modern accelerator, dipole and quadrupole magnets are used 
for bending and focusing, and microwave cavities for accelerating the 
particles. A key technical advance was the discovery of ‘phase stability’, 
which synchronized the accelerating voltage frequency and magnetic field 
strength with the rotation of the particle beam. This enabled circular 
machines with a beam pipe of fixed radius first to accelerate the beam 
(electron or proton) and then maintain it at its required operating en- 
ergy. Another key advance was ‘strong focusing’, which allowed the use 
of much smaller vacuum pipes and hence made much larger accelerators 
affordable. 

The first large proton accelerator at CERN,14 the Proton Synchrotron
(PS), has a diameter of 200 m and a maximum beam energy of
28 GeV. Remarkably, the PS, which started operating in 1959, is still
a key component of the CERN complex of accelerators. For particle 
physics experiments, a high-energy proton beam is extracted from the 
PS and then directed at a target and detector. ‘Secondary’ beams of 
relatively long-lived particles such as pions and kaons can also be pro- 
duced from the first target and selected by more magnets and particle 
identification devices. Producing neutral-particle beams is a bit more 
challenging, since they cannot be steered by electromagnetic methods. 
The production of neutrino beams is discussed in Chapter 11. 

Synchrotrons also accelerate electrons, such as in the original 7.5 GeV 
machine—the Deutsches Elektronen-Synchrotron—that gave the DESY 
laboratory in Hamburg its name. An extracted electron beam can be 
used directly for experiments or to produce a secondary photon beam 
by bremsstrahlung.15 Any remaining $e^\pm$ particles can be swept from the
path of the photon beam by magnets before the photon beam reaches 
the target. 

Energy loss by synchrotron radiation from an electron beam moving
in a circular orbit provides additional problems for the accelerator
physicist.16 The rate of energy loss by synchrotron radiation is
discussed in more detail in Chapter 3. Its effect is to limit the maximum
energy at a given radius. This can be countered by increasing the radius,
with the ultimate result being a linear accelerator. Physicists at
Stanford University first developed MeV-scale linacs for nuclear
structure physics in the 1950s. Somewhat later, the Stanford Linear
Accelerator Center (SLAC) was set up to build and operate a
two-mile-long electron linac, the longest to date. It started operating
in 1966 with a maximum beam energy of 20 GeV, which increased to nearly
50 GeV before high-energy physics experiments ceased in 1998.


14The very first accelerator at CERN
was the much smaller synchrocyclotron
with a maximum beam energy of 600 MeV.


15 Literally ‘braking radiation’, the process
is $e^\pm \to e^\pm\gamma$, in which a
high-energy electron emits a photon as it
traverses a thin layer of matter and is
deflected by the positive nuclear charge.
The closely related process of ‘pair
creation’, $\gamma \to e^+e^-$, will also occur;
together, the two processes give rise
to an electromagnetic ‘shower’ of $e^\pm$
particles.


16 Synchrotron radiation is essentially
the same basic physical process as
bremsstrahlung, but for much
lower-energy photons (roughly X-ray
energies). It is enormously useful for
investigating the structure of materials,
and synchrotron light sources (e.g.
Diamond in the UK and the European
Synchrotron Radiation Facility in France)
are examples of very practical spin-offs
from high-energy particle physics.


17 The discovery of the $J/\psi$
simultaneously by the Alternating Gradient
Synchrotron (AGS) at Brookhaven and
the SPEAR $e^+e^-$ collider at Stanford
in that year did indeed cause a
revolution in our understanding of particle
physics and in particular provided
crucial experimental evidence in support
of quantum chromodynamics as the
theory of the strong interaction.
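
The limit that synchrotron radiation places on circular electron machines
can be illustrated with a short sketch. It assumes the standard result for
the energy radiated per turn by an ultra-relativistic unit-charge particle,
$U = \tfrac{4\pi}{3}\,\alpha\,\hbar c\,\gamma^4/\rho$, which is not derived
in this chapter. The particle masses, the precise value of $\alpha$, and the
beam energy and bending radius in the example are illustrative assumptions,
not machine parameters taken from the text.

```python
import math

ALPHA = 1.0 / 137.036          # fine-structure constant (assumed precise value)
HBARC_GEV_M = 197.327e-18      # hbar*c in GeV m (i.e. 197.3 MeV fm)
ELECTRON_MASS_GEV = 0.000511   # assumed masses in GeV
PROTON_MASS_GEV = 0.938272


def loss_per_turn_gev(energy_gev, mass_gev, bending_radius_m):
    """Energy radiated per turn, U = (4*pi/3) * alpha * hbar*c * gamma^4 / rho.

    Valid for a unit-charge particle with beta close to 1.
    """
    gamma = energy_gev / mass_gev
    return (4.0 * math.pi / 3.0) * ALPHA * HBARC_GEV_M * gamma**4 / bending_radius_m


if __name__ == "__main__":
    # Hypothetical machine: 100 GeV beams on a 3000 m bending radius.
    print(loss_per_turn_gev(100.0, ELECTRON_MASS_GEV, 3000.0), "GeV per turn (electron)")
    print(loss_per_turn_gev(100.0, PROTON_MASS_GEV, 3000.0), "GeV per turn (proton)")
```

For these hypothetical numbers an electron loses roughly 3 GeV per turn,
while a proton of the same energy loses about thirteen orders of magnitude
less because of the $1/m^4$ dependence. Since the loss falls only as
$1/\rho$ but grows as $E^4$, raising the electron energy quickly demands a
much larger ring.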


1.4.2 Colliders 


What matters for the study of the physics is the centre-of-mass (CMS) 
energy. All the above accelerators produced beams for ‘fixed-target’ 
experiments—as the name implies, the target is stationary. A large frac- 
tion of the beam energy has been used simply to accelerate the CMS 
frame in the laboratory frame of the accelerator. To exploit the max- 
imum energy available from circular accelerators, one needs to have two 
counter-rotating beams and collide them head-on or nearly so. If the 
two beams are proton and antiproton or electron and positron, the same 
beam pipe can be used. An $e^+e^-$ collider with beams of 20 GeV gives
40 GeV in the CMS frame. 
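
The advantage of colliding beams can be made quantitative with a small
sketch. It assumes head-on (antiparallel) collisions, for which
$s = (E_1 + E_2)^2 - (p_1 - p_2)^2$ in natural units; the particle masses
are assumed values and the function name is ours. The fixed-target case is
obtained by giving the target an energy equal to its mass.

```python
import math

ELECTRON_MASS_GEV = 0.000511   # assumed masses, not quoted in the text
PROTON_MASS_GEV = 0.938272


def cms_energy(e1: float, m1: float, e2: float, m2: float) -> float:
    """Centre-of-mass energy (GeV) for two particles colliding head-on.

    e1 and e2 are total energies; a particle at rest has e equal to m.
    """
    p1 = math.sqrt(max(e1**2 - m1**2, 0.0))
    p2 = math.sqrt(max(e2**2 - m2**2, 0.0))
    return math.sqrt((e1 + e2) ** 2 - (p1 - p2) ** 2)


if __name__ == "__main__":
    # Two 20 GeV electron/positron beams colliding head-on.
    print(cms_energy(20.0, ELECTRON_MASS_GEV, 20.0, ELECTRON_MASS_GEV))
    # The same 20 GeV beam striking a proton at rest.
    print(cms_energy(20.0, ELECTRON_MASS_GEV, PROTON_MASS_GEV, PROTON_MASS_GEV))
    # Asymmetric collider: 27.5 GeV electrons on 920 GeV protons.
    print(cms_energy(27.5, ELECTRON_MASS_GEV, 920.0, PROTON_MASS_GEV))
```

The collider delivers the full 40 GeV, whereas the same beam on a stationary
proton provides only about 6 GeV in the CMS frame. The last line uses the
HERA beam energies quoted later in this section and gives about 318 GeV.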

The key challenge for colliders is to achieve sufficiently high luminosity 
to provide useful interaction rates. This difficulty has been solved as 
described in Chapter 3 and most of the major discoveries in particle 
physics in the past 40 years have been made at colliders of different types. 
The only exception is for physics requiring a particular type of incident 
particle rather than just a large interaction energy. For example, CP 
violation was discovered and studied in great detail using kaon beams. 
The most important such examples in recent years are the high-energy 
neutrino beams produced at CERN, Fermilab, and J-PARC in Japan. 

Apart from the LHC, much of the experimental information covered 
in this book has come from colliders operating in the 20 years leading up 
to the start of data-taking at the LHC (see Fig. 1.3), in particular the 
Large Electron-Positron (LEP) collider at CERN (a 27 km circumference
ring now containing the LHC) with CMS energies up to 209 GeV, the
Tevatron proton-antiproton collider at Fermilab (4-mile circumference)
with CMS energies up to 2 TeV, and the HERA electron-proton collider
at DESY (6.3 km circumference) with 27.5 GeV electrons on 920 GeV
protons providing a maximum CMS energy of 318 GeV. Older machines, 
particularly the $e^+e^-$ colliders (PEP and PETRA), provided data on
charm and beauty states after the 1974 ‘revolution’.17 The latter were
then studied in great detail at the dedicated ‘B-factories’: KEKB in 
Japan and PEP-II at SLAC. 


1.5 Detectors 


In the era of fixed-target experiments, the bubble chamber was one of the 
most important types of detector. As the name implies, it exploited the
fact that boiling could be initiated in a superheated liquid by the passage
of a charged particle. The liquid was kept under pressure until just before 
the beam arrived, at which time the chamber was expanded and then 
illuminated and photographed. The chamber was surrounded by a mag- 
netic field to bend charged-particle trajectories, thereby allowing their 
momentum to be determined. Bubble chamber pictures still provide a 
very good visual aid to understanding the kinematics of high-energy par- 
ticle collisions. To obtain quantitative information, it was necessary to 
scan the pictures manually using specialized measuring tables to digit- 
ize the tracks. Bubble chambers could only work with pulsed beams, 
and many were filled with liquid hydrogen, requiring very sophisticated 
cryogenic engineering and safety systems. An enormous bubble cham- 
ber known as Gargamelle (a 4m long, 2m diameter cylinder weighing 
1000 tonnes) was filled with 18 tonnes of Freon (a refrigerant) for neu- 
trino interactions. This device was designed to find evidence for ‘weak 
neutral currents’, which led the way to the discovery of the Z⁰. 

The alternatives to a visual device like a bubble chamber are electronic 
detectors. Devices such as spark and drift chambers give reasonably 
good spatial information on charged-particle tracks and most import- 
antly can cope with a much higher interaction and read-out rate than 
a bubble chamber. Drift-chamber technology provided a sophistication 
that made the big devices used in collider experiments almost the equiva- 
lent of an ‘electronic bubble chamber’. However, silicon detectors offer 
much better resolution than drift chambers and they have now become 
the detectors of choice for the inner detectors at LHC, although wire 
chambers are still required for the very large areas in muon chambers. 
The energies of both charged and neutral hadrons can be measured by 
a calorimeter—a device providing an electronic signal proportional to 
the energy deposited. A traditional design is the sampling calorimeter 
with plates of a heavy material (such as lead) separated by space for 
a charge-sensitive detector. The latter can be based on liquid argon or 
another inert element like krypton or it can be a plastic scintillator 
with a photomultiplier readout. Calorimeters are also crucial for provid- 
ing information that allows the separation of ‘electromagnetic particles’ 
(electrons and photons), hadrons, and muons. These subjects are covered 
in more detail in Chapter 4. 


1.6 Open questions 


From the 1960s on, the increases in the energy of accelerators and 
colliders and in the sophistication of detectors, together with powerful 
computers to analyse the data obtained, have provided a huge number 
of hadronic states with well-defined mass, width (or lifetime), spin, 
parity, and decay modes. The information is regularly updated and 
published in the ‘Review of Particle Physics’ by the much respected 
Particle Data Group Collaboration.¹⁸ Initially, the information was 
summarized on a card small enough to fit into a wallet. Now even the 
summary (the ‘Particle Physics Booklet’) is the size of a pocket diary and 
the full Review is a hefty journal volume of well over 1000 pages. The 
Standard Model gives a very good description of this wide range of 
experimental information on elementary particles and their weak, 
electromagnetic, and strong interactions covering an energy scale up to 
of order 1 TeV. However, it is certainly incomplete.¹⁹ Gravity is not 
included and there is no explanation of why there are three ‘generations’ 
of quarks and leptons. There are nearly 20 parameters (masses, coupling 
constants, and mixing angles) that are not given by the Standard Model 
but have to be determined from experimental measurements. Other big 
questions crowd in: one of the most glaring is the gross matter–antimatter 
asymmetry of the world we inhabit, in contrast to the matter–antimatter 
symmetry that occurs naturally in the Standard Model. According to 
astrophysical and cosmological evidence discussed in Chapter 13, ordinary 
baryonic matter constitutes only 5% of the universe. There is an 
expectation that answers, or at least some initial directions, will be 
uncovered by the experimental programme of the LHC and future 
neutrino experiments. 

¹⁸ Available online at http://pdg.lbl.gov/ or the PDG UK mirror site 
http://durpdg.dur.ac.uk/lbl/. 

¹⁹ See for example Steven Weinberg’s ‘Towards the final laws of physics’, 
in the 1986 Dirac Memorial Lectures; the details are in the further reading. 


1.7 Chapter outline 


Physics is an experimental science, and we would not have our cur- 
rent understanding of particle physics without the use of advanced 
particle accelerators and detectors. Chapter 3 introduces particle ac- 
celerators and explains some of the critical technology required for the 
successful operation of the LHC. Chapter 4 gives an introduction to 
the fundamental physics of particle detectors, with an emphasis on the 
modern techniques used at the LHC. This is obviously a vital subject 
for particle physics and the chapter attempts to describe this in greater 
depth than conventional undergraduate textbooks. The applications of 
these principles to particular experiments will be described in other 
chapters. 

Some of the theoretical and mathematical concepts such as symmet- 
ries that will be required throughout the rest of the book are covered in 
Chapter 2. An introduction to relativistic quantum mechanics is given 
in Chapter 6. The static quark model for hadrons is described in Chap- 
ter 5. The use of scattering experiments to probe the dynamic nature 
of quarks is covered in Chapter 9, which gives an outline of the quark— 
parton model as well as the evidence for gluons and a brief introduction 
to quantum chromodynamics (QCD). The weak interaction is introduced 
in Chapter 7, starting with the weak interaction of leptons. This is then 
extended to include quarks and the chapter ends with an introduction 
to electroweak (EW) unification, including the prediction of the W and 
Z bosons. A wide range of experiments that support EW unification 
are covered in Chapter 8, particularly those made possible by the high 
energies of the LEP, Tevatron, and LHC. Flavour oscillations and CP 
violation in the quark sector are explained in Chapter 10. Similar 
oscillations are seen in the neutrino sector, and the formalism and key 
experimental results are described in Chapter 11. The intriguing 
possibility that CP violation in neutrino oscillations could explain the 
observed matter–antimatter asymmetry in the universe is briefly reviewed. The 
Higgs mechanism is a fundamental aspect of the Standard Model. The 
theory and experimental evidence for the existence of a Higgs boson are 
given in Chapter 12. Finally, Chapter 13 starts with a review of Stand- 
ard Model physics at the LHC; it then explains why this is not the end 
of the story. Although the Standard Model is remarkably successful in 
explaining the current LHC data, there remain compelling reasons to 
believe that it is an incomplete theory. These are outlined in this chap- 
ter, together with a discussion of theoretical ideas beyond the Standard 
Model that might cure some of its problems. The evidence for dark mat- 
ter and dark energy is also reviewed, as well as the different attempts to 
discover dark matter. 


1.8 How to read this book 


As we mentioned in the Preface, we have the ambitious aim of cover- 
ing all aspects of particle physics. This inevitably means that not all 
topics will be of equal interest to all readers. The next three chap- 
ters provide technical information: Chapter 2 on mathematical methods; 
Chapters 3 and 4 on accelerators and detectors, respectively. Depend- 
ing on the reader’s interest or experience, these can be skipped or returned 
to later. Similarly, Chapter 6, an introduction to relativistic quantum 
mechanics and the Dirac equation, is not essential for understanding 
many of the experimental results, but it is crucial for understanding the 
concept of antiparticles. Chapter 5 shows how the static quark model 
can explain the observed pattern of hadronic masses and quantum num- 
bers. The second half of the book (Chapters 7-13) covers all aspects of 
experimental particle physics, informed by recent results from the LHC 
and other particle accelerators and experiments. If you need to use the 
most recent and accurate experimental results, you should consult the 
PDG tables [115]. 


Chapter summary 


• Aim of the book: a practical introduction to particle physics. 
• The basic building blocks: leptons and quarks. 
• The electroweak force and the photon, W±, and Z⁰. 
• The strong force, QCD, and gluons. 


Further reading 


• Close, F., Marten, M., and Sutton, C. (1987). The Particle Explosion. 
  Oxford University Press. 
• Pais, A. (1986). Inward Bound: Of Matter and Forces in the Physical 
  World. Oxford University Press. 
• Feynman, R. P. and Weinberg, S. (1987). Elementary Particles and the 
  Laws of Physics: The 1986 Dirac Memorial Lectures. Cambridge 
  University Press. 


Mathematical methods 


This chapter covers rotational and Lorentz invariance. Related space- 
time symmetries such as parity, time reversal, and charge conjugation 
are also defined. We give a brief introduction to group theory and its use 
in the mathematical construction of the standard model—both in the 
electroweak sector and in the strong interaction. The idea and usefulness 
of approximate internal symmetries are explored, using nuclear isospin 
as an example. This chapter also covers the essential steps in connecting 
calculations to measured quantities such as cross sections and decay 
rates. It is assumed that the reader is familiar with the quantization 
of angular momentum in non-relativistic quantum mechanics and has 
taken a first course in special relativity. 

While investigating the invariance of general relativity in 1918, Emmy 
Noether determined the conserved quantities for all physical laws that 
are based on a continuous symmetry. Specifically, there are the following 
associations between symmetries and conserved quantities: 


Symmetry                                 Conserved quantity 
spatial rotation                    ↔    angular momentum 
temporal translation                ↔    energy 
spatial translation                 ↔    momentum 
electromagnetic gauge invariance    ↔    electric charge 


Three discrete symmetries are also important in nuclear and particle 
physics: spatial parity (P), charge conjugation (C), and time reversal 
(T). All three are good symmetries of both the electromagnetic and 
strong interactions. The weak interaction famously breaks both C and 
P symmetries maximally but is CP-invariant for many processes. Vio- 
lation of CP invariance has been observed in the interactions of neutral 
meson systems, particularly kaons and beauty mesons.¹ The product of 
all three, CPT, is expected to be a universal symmetry of physics and 
is a cornerstone of quantum field theory. 


2.1 Discrete symmetries 


2.1.1 Spatial parity 


The parity operator performs a spatial inversion through the origin: 

$$\psi'(\mathbf{x}, t) = P\psi(\mathbf{x}, t) = \psi(-\mathbf{x}, t)$$

Applying the parity operator twice must return the original state: 

$$PP\psi(\mathbf{x},t) = \psi(\mathbf{x},t), \quad\text{so}\quad P^2 = I$$

To preserve the normalization of the wavefunction, 

$$\langle\psi|\psi\rangle = \langle\psi'|\psi'\rangle = \langle\psi|P^\dagger P|\psi\rangle$$

Therefore, 

$$P^\dagger P = I \quad (P \text{ is unitary})$$

and since $P^2 = I$, 

$$P^\dagger = P \quad (P \text{ is Hermitian})$$

which implies that parity can be an observable with eigenvalues ±1. 

Parity changes the direction of vector quantities (r, p) but conserves 
quantities that are products of vectors, such as j = r × p. Furthermore, 
since P has no explicit time dependence, 

$$i\frac{d\langle P\rangle}{dt} = \langle [P, H] \rangle$$

Therefore, parity is a constant of the motion (i.e. conserved) when the 
interaction Hamiltonian commutes with P. 

We define the following intrinsic parity of fundamental particles: 



• Spin-1 bosons: gluons and the photon have intrinsic parity P = −1: 

$$P|\gamma\rangle = -|\gamma\rangle \quad\text{and}\quad P|g\rangle = -|g\rangle$$

• Spin-½ fermions: particles are of opposite parity to antiparticles; 
  this follows from the Dirac equation (see Chapter 6). The conventional 
  choice is 

$$P|e^-\rangle = P|\nu\rangle = P|q\rangle = +1$$
$$P|e^+\rangle = P|\bar{\nu}\rangle = P|\bar{q}\rangle = -1$$

The weak gauge bosons (Z⁰ and W±) are not eigenstates of spatial 
parity and thus do not have a definite parity quantum number. 


2.1.2 Charge conjugation 


The charge-conjugation operator C changes a particle into its antipar- 
ticle. Generally, few particles are eigenstates of C; for example, a u quark 
has charge +2/3 and its antiparticle −2/3. Nevertheless, C is a useful quantity 
when considering electromagnetic or strong decays of neutral mesons. 

The photon is a neutral, fundamental particle. Its intrinsic 
charge-conjugation quantum number can be inferred by considering its 
correspondence with classical wave theory. It is clear that upon reversing 
charge, the electric field and the electromagnetic scalar potential change 
sign: 

$$\mathbf{E}(\mathbf{x}, t) \to -\mathbf{E}(\mathbf{x},t), \qquad \phi(\mathbf{x},t) \to -\phi(\mathbf{x},t)$$

The vector potential A(x,t), which is connected with the photon 
wavefunction, is related to E and φ by 

$$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t}$$

Inserting the charge-reversed E and φ requires 

$$C\mathbf{A} = -\mathbf{A}$$

so the photon has C = −1. It is then simple to identify C|π⁰⟩ = +1|π⁰⟩ 
from the dominant electromagnetic decay π⁰ → γγ (branching ratio = 98.8%), 
since the two-photon final state has C = (−1)² = +1. 

2.1.3 Time reversal 


In line with the spatial parity transformation, we might expect that time 
reversal would be given by 

$$\psi_T(\mathbf{x}, t) = T\psi(\mathbf{x}, t) = \eta_T\,\psi(\mathbf{x}, -t), \quad\text{where } |\eta_T| = 1$$

Both classical mechanics and electromagnetism respect time reversal. 
In classical mechanics, for a time-independent potential V(x), Newton’s 
equations of motion can be derived from the energy-conservation 
equation² 

$$E = \frac{1}{2}m\dot{x}^2 + V(x)$$

by differentiation with respect to time, giving 

$$m\dot{x}\ddot{x} + \frac{dV}{dx}\dot{x} = 0, \quad\text{or}\quad m\frac{d^2x}{dt^2} = -\frac{dV}{dx}$$

so the equation of motion is unchanged by the change t → −t. However, 
this will not be correct for quantum mechanics, because Schrödinger’s 
equation involves a first-order time derivative. For a time-independent 
Hamiltonian H and ψ an eigenstate of energy, 

$$i\hbar\frac{\partial\psi}{\partial t} = H\psi = E\psi \quad\Rightarrow\quad \psi(x, t) = \psi(x, 0)\,e^{-iEt/\hbar}$$

If we apply the T operator as defined above to this equation, we find 

$$T\psi(x,t) = \eta_T\,\psi(x, 0)\,e^{+iEt/\hbar}, \quad\text{where } |\eta_T| = 1$$

The time-reversed state appears to have negative energy.³ What matters 
in quantum mechanics is what it predicts for observable quantities, 
and this requires calculating normalized matrix elements, for which we 
need $\psi^*(x,t) = \psi^*(x, 0)\,e^{+iEt/\hbar}$ as well as ψ(x,t). Looking at the time 
dependence of this state gives a hint as to how to proceed: we modify the 
T operator so that, in addition to requiring the change (x,t) → (x, −t), 
we demand that ψ → ψ*, so now 

$$\psi_T(x,t) = T\psi(x,t) = \eta_T\,\psi^*(x, -t) = \eta_T\,\psi^*(x, 0)\,e^{-iEt/\hbar}$$

Such an operator is known as anti-unitary. It is straightforward to show 
that the normalization of ψ(x,t) is unchanged by the T operation, 
provided that |η_T|² = 1. 

² We work in one spatial dimension for simplicity. 

³ We will consider another view of negative-energy states in Chapter 6. 


2.1.4 J^PC of hadrons 


Parity and, where applicable, charge-conjugation quantum numbers are 
often quoted, for a particular state, with its total angular momentum 
J = L + S as a J^PC number. The J^P of a particle is closely related to the 
spatial transformation properties of the state. Particles with J^P = 0⁻ are 
called pseudoscalar particles and those with J^P = 0⁺ scalar. Particles with 
J^P = 1⁻ are called vector particles and those with J^P = 1⁺ axial 
vector. 


Mesons 


Mesons are qq̄ bound states. As the quark and antiquark have opposite 
intrinsic parity, a ground-state meson will always have P = −1. Excited 
states bring additional parity factors according to (−1)^L: 

$$P_\text{meson} = (-1)^{L+1}$$

C is defined (for light neutral mesons only) by interchanging q ↔ q̄ and 
swapping their positions and spin: 

$$C_\text{meson} = (-1)^{L+S}$$

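The two rules above are easily tabulated. The following is a minimal Python sketch (the function names are our own illustrative choices, not notation used elsewhere in the book) that lists the J^PC labels available to a light neutral qq̄ meson with given L and S, assuming only P = (−1)^(L+1) and C = (−1)^(L+S):

```python
def meson_parity(L):
    """Parity of a q-qbar meson with orbital angular momentum L."""
    return (-1) ** (L + 1)


def meson_c_parity(L, S):
    """C eigenvalue of a light neutral q-qbar meson with given L and S."""
    return (-1) ** (L + S)


def meson_jpc(L, S):
    """List the J^PC labels for all J between |L - S| and L + S."""
    sign = {1: "+", -1: "-"}
    p = sign[meson_parity(L)]
    c = sign[meson_c_parity(L, S)]
    return [f"{J}{p}{c}" for J in range(abs(L - S), L + S + 1)]
```

For example, `meson_jpc(0, 0)` gives `['0-+']` (the pseudoscalars such as the π⁰) and `meson_jpc(0, 1)` gives `['1--']` (the vector mesons such as the ρ⁰).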
Baryons 


Baryons contain three quarks and as such cannot be their own antipar- 
ticle; C is undefined. The calculation of baryon parity is more complex 
than for mesons because one must consider the angular momentum of 
a three-body system. The intrinsic parity of a baryon is (+1)³ = +1; 
similarly, it is (−1)³ = −1 for an antibaryon. In full, 

$$P_\text{baryon} = B\,(-1)^{L_{12}}(-1)^{L_3}$$

where B is the baryon number, L₁₂ is the angular momentum between 
quarks 1 and 2, and L₃ is the angular momentum of the third quark 
relative to the 1-2 pair. 


2.1.5 Useful examples 


• What is the J^PC of a π⁰? 
  The π⁰ is in the ground state, with L = S = J = 0. From the 
  formula, P = −1 and C = +1. So J^PC = 0⁻⁺. 

• Is ρ⁰ → π⁺π⁻ an allowed decay? 
  This is an excited initial state, with S = J = 1. L = 0, so P = −1. 
  Also, from above, C = −1, so J^PC = 1⁻⁻. 
  The pair of pions have intrinsic parity +1 from (−1)_π⁺ (−1)_π⁻ and 
  each has J = 0. 
  So producing them in an L = 1 state will simultaneously conserve 
  J, P, and C. 

• What about ρ⁰ → π⁰π⁰? 
  This would be similar to the previous example, requiring the pion 
  pair to be in an L = 1 state. 
  However, applying C to the final state has no effect:⁴ 

  C|π⁰π⁰⟩ = |π⁰π⁰⟩,  C = +1,  J^PC(π⁰π⁰) = 1⁻⁺ 

  So this process is forbidden by C conservation in the strong and 
  electromagnetic interactions. 
  Indeed, it has never been observed. 

• ρ⁰ → e⁺e⁻ 
  This is similar to ρ⁰ → π⁺π⁻ except that the e⁺e⁻ system has 
  intrinsic parity −1 from (+1)_e⁻ (−1)_e⁺. Producing a final state 
  with the spins aligned, S = 1, simultaneously conserves J, P, and 
  C. Note that this process only proceeds via the electromagnetic 
  force and is therefore only ~10⁻⁴ as likely as the π⁺π⁻ mode. 

• ρ⁰ → π⁰γ 
  The initial state is J^PC = 1⁻⁻. The photon has J^PC = 1⁻⁻ and 
  the π⁰ has 0⁻⁺. 
  But, in addition to their intrinsic parity, photons carry parity 
  (−1)^L away from a system. 
  Therefore, this electromagnetic decay is allowed. 

• What is the J^PC of a K⁺? 
  |s̄u⟩ is not an eigenstate of charge conjugation, so the C quantum 
  number is undefined. 
  Therefore, the label becomes J^P = 0⁻ (ground-state kaon). 


2.2 Addition of angular momentum 


2.2.1 Angular momentum in quantum mechanics 


In quantum mechanics, the angular momentum operators J_x, J_y, J_z do 
not commute, but they do satisfy the commutation relations 

$$[J_i, J_j] = i\,\epsilon_{ijk}J_k \tag{2.1}$$

where ε_ijk = +1 if ijk is an even permutation of xyz (e.g. yzx), −1 if it 
is an odd permutation (e.g. zyx), and 0 if any of the indices are identical 
(e.g. yyx). The only operator that commutes with J_x, J_y, J_z is the 
total angular momentum J²: 

$$J^2 = J_x^2 + J_y^2 + J_z^2 \quad\text{and}\quad [J^2, J_i] = 0 \quad (i = x, y, z)$$

Eigenfunctions of J² and J_z are labelled with the eigenvalues of J² 
and J_z: 

$$J^2|j,m\rangle = j(j+1)|j,m\rangle$$
$$J_z|j,m\rangle = m|j,m\rangle$$

where m ∈ {−j, −j+1, …, j−1, j}. So, for a given j value there are 2j + 1 
values of m, with the states related by raising and lowering operators 

$$J_\pm = J_x \pm iJ_y \tag{2.2}$$

which, from eqn 2.1, satisfy 

$$[J_z, J_\pm] = \pm J_\pm \tag{2.3}$$

⁴ Applying charge conjugation to a single neutral pion gives 
C|π⁰⟩ = |π⁰⟩; therefore applying the charge-conjugation operator to two 
neutral pions gives a factor of one. Note that this decay mode is also 
forbidden by Bose–Einstein symmetry: the two pions must be in an 
L = 1 state to conserve angular momentum but this would require the 
wavefunction to be antisymmetric with respect to exchange of identical 
bosons. 



Using this commutation relation, we have 

$$\begin{aligned}
J_z J_-|j,m\rangle &= (J_- J_z - J_-)|j,m\rangle \\
&= J_-(J_z - 1)|j,m\rangle \\
&= (m-1)\,J_-|j,m\rangle
\end{aligned}$$

Similarly, m + 1 is the eigenvalue of J_z applied to the ‘raised’ state 
J_+|j,m⟩. So we can generally write 

$$J_\pm|j,m\rangle = C_\pm(j,m)\,|j,m\pm1\rangle \tag{2.4}$$

with the boundary condition that C_± is required to be 0 for J_−|j,−j⟩ 
and J_+|j,+j⟩. 

To derive these constants, we note that J_+ and J_− are Hermitian 
adjoints (although they are not Hermitian operators): 

$$\begin{aligned}
\langle j,m|J_+^\dagger|j,m+1\rangle &= \langle j,m|J_-|j,m+1\rangle \\
&= C_-(j,m+1)\,\langle j,m|j,m\rangle \\
&= C_-(j,m+1)
\end{aligned} \tag{2.5}$$

We take the complex conjugate of this equation to give 

$$\langle j,m+1|J_+|j,m\rangle = C_+(j,m) = C_-^*(j,m+1)$$

so, choosing the coefficients to be real, the lowering coefficient of the 
m + 1 state is the same as the raising coefficient of the m state: 

$$C_-(j,m+1) = C_+(j,m) \equiv C \tag{2.6}$$

Therefore, applying both operators successively, 

$$J_-J_+|j,m\rangle = C^2|j,m\rangle \tag{2.7}$$

where the double operator can be broken down to 

$$\begin{aligned}
J_-J_+ &= J_x^2 + J_y^2 + i(J_xJ_y - J_yJ_x) \\
&= J_x^2 + J_y^2 + J_z^2 - J_z^2 + i[J_x, J_y] \\
&= J^2 - J_z^2 - J_z \\
&= J^2 - J_z(J_z + 1)
\end{aligned} \tag{2.8}$$

from which we can easily identify the value of C² and the 
coefficients C_±: 

$$C^2 = j(j+1) - m(m+1)$$
$$C_-(j,m) = \sqrt{j(j+1) - m(m-1)}, \qquad C_+(j,m) = \sqrt{j(j+1) - m(m+1)} \tag{2.9}$$
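The coefficients of eqn 2.9 can be checked numerically. The short Python sketch below is an illustration only (the helper names `c_plus`, `c_minus`, and `m_values` are our own); it evaluates the ladder coefficients for integer or half-integer j using exact fractions for j and m, so that the boundary condition and eqn 2.6 can be tested directly:

```python
from fractions import Fraction
from math import sqrt


def c_plus(j, m):
    """Raising coefficient C_+(j, m) of eqn 2.9."""
    return sqrt(j * (j + 1) - m * (m + 1))


def c_minus(j, m):
    """Lowering coefficient C_-(j, m) of eqn 2.9."""
    return sqrt(j * (j + 1) - m * (m - 1))


def m_values(j):
    """The 2j + 1 allowed values m = -j, -j+1, ..., j (as exact fractions)."""
    j = Fraction(j)
    return [-j + k for k in range(int(2 * j) + 1)]


for j in (Fraction(1, 2), Fraction(1), Fraction(3, 2)):
    ms = m_values(j)
    # Boundary condition: the ladder stops at m = +j and m = -j.
    assert c_plus(j, ms[-1]) == 0 and c_minus(j, ms[0]) == 0
    # Eqn 2.6: lowering from m + 1 equals raising from m.
    assert all(c_minus(j, m + 1) == c_plus(j, m) for m in ms[:-1])
```

Exact fractions are used for j and m so that the argument of the square root vanishes identically at the ends of the ladder rather than leaving a rounding residue.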


2.2.2 Addition and Clebsch—Gordan coefficients 


Let J₁ and J₂ be two angular momentum operators, for example the 
orbital angular momentum and spin of a particle: 

$$\begin{aligned}
J_1^2\,|j_1,m_1\rangle &= j_1(j_1+1)\,|j_1,m_1\rangle \\
J_2^2\,|j_2,m_2\rangle &= j_2(j_2+1)\,|j_2,m_2\rangle \\
J_{1z}\,|j_1,m_1\rangle &= m_1\,|j_1,m_1\rangle \\
J_{2z}\,|j_2,m_2\rangle &= m_2\,|j_2,m_2\rangle
\end{aligned} \tag{2.10}$$

We define the combined operators 

$$J^2 = (J_{1x} + J_{2x})^2 + (J_{1y} + J_{2y})^2 + (J_{1z} + J_{2z})^2$$
$$J_z = J_{1z} + J_{2z}$$

acting on 

$$\psi = |j_1,m_1\rangle\,|j_2,m_2\rangle$$

ψ is an eigenfunction of J_z with eigenvalue m = m₁ + m₂ but it is 
generally not an eigenfunction of J². However, linear combinations of ψ 
can produce eigenfunctions Ψ(j, m, j₁, j₂) of J² with eigenvalues j(j + 1): 

$$\Psi = \sum_{m_1=-j_1}^{j_1}\;\sum_{m_2=-j_2}^{j_2} C_{j_1,j_2}(j,m,m_1,m_2)\,\psi \tag{2.11}$$

The coefficients C_{j₁,j₂}(j, m, m₁, m₂) are the Clebsch–Gordan⁵ 
coefficients. They are the probability amplitudes for obtaining, from a 
combined system |j₁,m₁⟩|j₂,m₂⟩, the eigenvalue j(j + 1) when applying 
the operator J². The details can be found in quantum mechanics texts; 
here we will consider a useful example. 

⁵ Clebsch and Gordan were nineteenth-century mathematicians who 
identified […] 


2.2.3 Calculation of Clebsch-Gordan coefficients 


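Consider two particles |j₁,m₁⟩ and |j₂,m₂⟩ forming a combined state 
|j,m⟩, where j₁ = 1 and j₂ = 1/2. Evidently, j can be 3/2 or 1/2. The 
following two points are key: 

• Coefficients should normalize to 1. 

• On raising (lowering) a state to an m higher (lower) than j (−j), 
  the resultant state = 0. 

j = 3/2 states: 

Evidently, the states with maximum (3/2) and minimum (−3/2) m are 

$$|\tfrac32,\tfrac32\rangle = |1,1\rangle\,|\tfrac12,\tfrac12\rangle, \qquad |\tfrac32,-\tfrac32\rangle = |1,-1\rangle\,|\tfrac12,-\tfrac12\rangle$$

Using the definition $C_\pm(j,m) = \sqrt{j(j+1) - m(m\pm1)}$, we have 

$$\begin{aligned}
J_-|\tfrac12,\tfrac12\rangle &= |\tfrac12,-\tfrac12\rangle \\
J_-|1,1\rangle &= \sqrt2\,|1,0\rangle \\
J_-|1,0\rangle &= \sqrt2\,|1,-1\rangle \\
J_-|1,-1\rangle &= 0
\end{aligned}$$

Now operating on the combined state, we have 

$$\begin{aligned}
J_-|\tfrac32,\tfrac32\rangle &= J_-\left(|1,1\rangle\,|\tfrac12,\tfrac12\rangle\right) \\
\sqrt3\,|\tfrac32,\tfrac12\rangle &= \sqrt2\,|1,0\rangle\,|\tfrac12,\tfrac12\rangle + |1,1\rangle\,|\tfrac12,-\tfrac12\rangle \\
|\tfrac32,\tfrac12\rangle &= \sqrt{\tfrac23}\,|1,0\rangle\,|\tfrac12,\tfrac12\rangle + \sqrt{\tfrac13}\,|1,1\rangle\,|\tfrac12,-\tfrac12\rangle
\end{aligned}$$

In an analogous way, 

$$|\tfrac32,-\tfrac12\rangle = \sqrt{\tfrac13}\,|1,-1\rangle\,|\tfrac12,\tfrac12\rangle + \sqrt{\tfrac23}\,|1,0\rangle\,|\tfrac12,-\tfrac12\rangle$$

j = 1/2 states: 

$$|\tfrac12,\tfrac12\rangle = a\,|1,1\rangle\,|\tfrac12,-\tfrac12\rangle + b\,|1,0\rangle\,|\tfrac12,\tfrac12\rangle$$

with a² + b² = 1 because of normalization. We now apply J_+: 

$$J_+|\tfrac12,\tfrac12\rangle = 0 = a\,|1,1\rangle\,|\tfrac12,\tfrac12\rangle + \sqrt2\,b\,|1,1\rangle\,|\tfrac12,\tfrac12\rangle$$

Therefore, a + √2 b = 0, which with a² + b² = 1 gives a = √(2/3) and 
b = −√(1/3). 
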
These results are summarized in Tables 2.1 and 2.2. 

  1 ⊗ 1/2      j    3/2     3/2       1/2        1/2       3/2      3/2 
               m   +3/2    +1/2      +1/2       −1/2      −1/2     −3/2 
   m₁    m₂ 
   +1   +1/2         1       0         0          0         0        0 
   +1   −1/2         0     √(1/3)    √(2/3)       0         0        0 
    0   +1/2         0     √(2/3)   −√(1/3)       0         0        0 
    0   −1/2         0       0         0        √(1/3)    √(2/3)     0 
   −1   +1/2         0       0         0       −√(2/3)    √(1/3)     0 
   −1   −1/2         0       0         0          0         0        1 

Table 2.1 Clebsch–Gordan coefficients for j₁ = 1 and j₂ = 1/2. 

  1/2 ⊗ 1/2    j     1       1         0          1 
               m    +1       0         0         −1 
   m₁    m₂ 
  +1/2  +1/2         1       0         0          0 
  +1/2  −1/2         0     √(1/2)    √(1/2)       0 
  −1/2  +1/2         0     √(1/2)   −√(1/2)       0 
  −1/2  −1/2         0       0         0          1 

Table 2.2 Clebsch–Gordan coefficients for j₁ = 1/2 and j₂ = 1/2. 
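The ladder construction used in this section can be reproduced in a few lines. The following Python sketch is illustrative only: product-basis states are stored as dictionaries keyed by (m₁, m₂), and the helper names `lower` and `normalize` are our own. It starts from the stretched state |3/2, 3/2⟩ = |1,1⟩|1/2,1/2⟩, applies J₋ = J₁₋ + J₂₋, and then builds the j = 1/2 state as the orthogonal combination with the sign convention of Table 2.1:

```python
from fractions import Fraction as F
from math import sqrt


def c_minus(j, m):
    """Lowering coefficient C_-(j, m) = sqrt(j(j+1) - m(m-1))."""
    return sqrt(j * (j + 1) - m * (m - 1))


def lower(state, j1, j2):
    """Apply J_- = J1_- + J2_- to a state {(m1, m2): amplitude}."""
    out = {}
    for (m1, m2), amp in state.items():
        if m1 > -j1:  # lower the first particle
            key = (m1 - 1, m2)
            out[key] = out.get(key, 0.0) + amp * c_minus(j1, m1)
        if m2 > -j2:  # lower the second particle
            key = (m1, m2 - 1)
            out[key] = out.get(key, 0.0) + amp * c_minus(j2, m2)
    return out


def normalize(state):
    """Rescale the amplitudes so that their squares sum to 1."""
    norm = sqrt(sum(a * a for a in state.values()))
    return {k: a / norm for k, a in state.items()}


j1, j2 = F(1), F(1, 2)
top = {(j1, j2): 1.0}                        # |3/2, 3/2>
s32_p12 = normalize(lower(top, j1, j2))      # |3/2, +1/2>
s32_m12 = normalize(lower(s32_p12, j1, j2))  # |3/2, -1/2>

# |1/2, +1/2> is orthogonal to |3/2, +1/2> within the m = +1/2 subspace.
s12_p12 = {(j1, -j2): s32_p12[(F(0), j2)],
           (F(0), j2): -s32_p12[(j1, -j2)]}
```

The amplitudes in `s32_p12`, `s32_m12`, and `s12_p12` can be compared entry by entry with the corresponding columns of Table 2.1.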


2.3 Spatial rotations 


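Consider a small rotation ε about the y axis, 

$$\mathbf{x}' = R_y(\epsilon)\,\mathbf{x}$$

$$\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix} =
\begin{pmatrix} \cos\epsilon & 0 & \sin\epsilon \\ 0 & 1 & 0 \\ -\sin\epsilon & 0 & \cos\epsilon \end{pmatrix}
\begin{pmatrix} x \\ y \\ z \end{pmatrix}$$

and its inverse, 

$$\mathbf{x} = R_y^{-1}(\epsilon)\,\mathbf{x}'$$

$$\begin{pmatrix} x \\ y \\ z \end{pmatrix} =
\begin{pmatrix} \cos\epsilon & 0 & -\sin\epsilon \\ 0 & 1 & 0 \\ \sin\epsilon & 0 & \cos\epsilon \end{pmatrix}
\begin{pmatrix} x' \\ y' \\ z' \end{pmatrix}$$

We impose invariance: 

$$\psi'(\mathbf{x}') = \psi(\mathbf{x}) = \psi\!\left(R_y^{-1}(\epsilon)\,\mathbf{x}'\right)$$

Without loss of generality, we take a specific point x′ = a and find a 
relation between ψ′(a) and ψ(a): 

$$\begin{aligned}
\psi'(\mathbf{a}) &= \psi\!\left(R_y^{-1}(\epsilon)\,\mathbf{a}\right) \\
&= \psi(a_x\cos\epsilon - a_z\sin\epsilon,\; a_y,\; a_z\cos\epsilon + a_x\sin\epsilon) \\
&\to \psi(a_x - \epsilon a_z,\; a_y,\; a_z + \epsilon a_x) \quad\text{as } \epsilon \to 0 \\
&= \psi(\mathbf{a}) - \epsilon a_z\frac{\partial\psi(\mathbf{a})}{\partial x} + \epsilon a_x\frac{\partial\psi(\mathbf{a})}{\partial z} + \cdots \quad\text{(Taylor expansion)} \\
&= (1 - i\epsilon J_y)\,\psi(\mathbf{a}) \quad\text{since } J_y = -i\left(z\frac{\partial}{\partial x} - x\frac{\partial}{\partial z}\right) \\
&\equiv U_y(\epsilon)\,\psi(\mathbf{a})
\end{aligned}$$

⁶ Here we explicitly assume that the non-relativistic Schrödinger 
equation is valid. The resulting conservation law is therefore only valid 
in the non-relativistic limit. The relativistic case requires the use of the 
Dirac equation and this will be discussed in Chapter 6. 

⁷ Section 2.7 covers the essentials of group theory. 
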


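Conservation ⇔ invariance 


Consider the time variation of U_y:⁶ 

$$\begin{aligned}
\frac{d}{dt}\langle\psi(t)|U_y|\psi(t)\rangle
&= \left[\frac{\partial}{\partial t}\langle\psi(t)|\right]U_y|\psi(t)\rangle
 + \langle\psi(t)|\frac{\partial U_y}{\partial t}|\psi(t)\rangle
 + \langle\psi(t)|U_y\left[\frac{\partial}{\partial t}|\psi(t)\rangle\right] \\
&= \langle\psi(t)|\frac{\partial U_y}{\partial t}|\psi(t)\rangle
 + i\,\langle\psi(t)|(HU_y - U_yH)|\psi(t)\rangle
\end{aligned}$$

Hence, U_y is invariant if J_y (the non-trivial part of U_y) is independent of 
time and commutes with the Hamiltonian; i.e. the eigenvalues of U_y(ε) 
are constant. Angular momentum is conserved owing to the requirement 
that the wavefunction be invariant under rotation. 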


Rotation is a Lie group⁷ and so any rotation can be expressed as the 
successive application of the infinitesimal rotation: 

$$U_y(\beta) = \lim_{n\to\infty}\left(1 - i\frac{\beta}{n}J_y\right)^n = e^{-i\beta J_y} \tag{2.12}$$

or, for a general three-dimensional rotation, 

$$U(\boldsymbol{\theta}) = e^{-i\boldsymbol{\theta}\cdot\mathbf{J}}$$

In the language of operators, the angular momentum operator J_y is said 
to be the generator of rotations about the y axis. 


Euler angles 


Generic rotations can be parameterized by Euler angles, which are 
defined by three successive rotations: 

(1) rotation by an angle γ about the z axis; 
(2) rotation by an angle β about the original y axis; 
(3) rotation by an angle α about the new z′ axis. 

This is inconvenient, because these definitions use two different bases. 
However, this transformation is actually the same as 

(1) rotation by an angle α about the original z axis; 
(2) rotation by an angle β about the original y axis; 
(3) rotation by an angle γ about the original z axis. 

The generic rotation of wavefunctions can be represented by 
D-matrices: 

$$\begin{aligned}
D^j_{m'm}(\alpha,\beta,\gamma) &= \langle j,m'|\,e^{-i\alpha J_z}e^{-i\beta J_y}e^{-i\gamma J_z}\,|j,m\rangle \\
&= e^{-i(\alpha m' + \gamma m)}\,d^j_{m'm}(\beta)
\end{aligned}$$


Rotation matrix: $d^j_{m'm}(\beta)$ 


Although the y projection is unchanged, the z direction has changed, so 
the quantum number m is not the same: it is now projected onto a new 
z′ axis. A state |j,m⟩ transforms under a rotation β about the y axis 
into a linear combination of the 2j + 1 states |j,m′⟩: 

$$e^{-i\beta J_y}|j,m\rangle = \sum_{m'} d^j_{m'm}(\beta)\,|j,m'\rangle \tag{2.14}$$


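Multiplying with ⟨j,m′| gives 

$$d^j_{m'm}(\beta) = \langle j,m'|\,e^{-i\beta J_y}\,|j,m\rangle \tag{2.15}$$

Calculation of the matrix proceeds as follows: 

(1) From inspection of the expansion of $e^{-i\beta J_y}$, 

$$e^{-i\beta J_y} = 1 - i\beta J_y - \frac{\beta^2}{2!}J_y^2 + i\frac{\beta^3}{3!}J_y^3 + \frac{\beta^4}{4!}J_y^4 + \cdots$$

(2) Look separately for solutions of $J_y^{2n+1}|j,m\rangle$ and $J_y^{2n}|j,m\rangle$. 

(3) Recall the raising/lowering operators J_± = J_x ± iJ_y and the 
ladder coefficients $C_\pm^{jm} = \sqrt{j(j+1) - m(m\pm1)}$ of eqn 2.9. Then 

$$J_+ - J_- = 2iJ_y, \qquad J_y = -\frac{i}{2}(J_+ - J_-)$$

so 

$$J_y|1,1\rangle = -\frac{i}{2}\left(0 - C_-^{11}|1,0\rangle\right) = \frac{i}{\sqrt2}\,|1,0\rangle \quad\text{with } C_-^{11} = \sqrt2$$

(4) Operate again with J_y: 

$$\begin{aligned}
J_y^2|1,1\rangle &= \frac{i}{\sqrt2}\,J_y|1,0\rangle
= \frac{i}{\sqrt2}\left(-\frac{i}{2}\right)\left(C_+^{10}|1,1\rangle - C_-^{10}|1,-1\rangle\right) \\
&= \frac{1}{2\sqrt2}\,\sqrt2\,\big(|1,1\rangle - |1,-1\rangle\big) \\
&= \frac12\big(|1,1\rangle - |1,-1\rangle\big), \quad\text{since } C_+^{10} = C_-^{10} = \sqrt2
\end{aligned}$$

(5) And again: 

$$\begin{aligned}
J_y^3|1,1\rangle &= \frac12\,J_y\big(|1,1\rangle - |1,-1\rangle\big) \\
&= \frac{i}{4}\left(\sqrt2\,|1,0\rangle + \sqrt2\,|1,0\rangle\right) = \frac{i}{\sqrt2}\,|1,0\rangle
\end{aligned}$$

(6) Note the cyclical pattern and conclude that 

$$J_y^{2n+1}|1,1\rangle = \frac{i}{\sqrt2}\,|1,0\rangle \quad (n \ge 0), \qquad
J_y^{2n}|1,1\rangle = \frac12\big(|1,1\rangle - |1,-1\rangle\big) \quad (n \ge 1)$$

(7) Which then leads, for each specific ⟨j,m′| state, to the following 
(with n ≥ 1 for the even powers): 

$$\begin{aligned}
\langle 1,1|J_y^{2n+1}|1,1\rangle &= 0, & \langle 1,1|J_y^{2n}|1,1\rangle &= \tfrac12 \\
\langle 1,-1|J_y^{2n+1}|1,1\rangle &= 0, & \langle 1,-1|J_y^{2n}|1,1\rangle &= -\tfrac12 \\
\langle 1,0|J_y^{2n+1}|1,1\rangle &= \tfrac{i}{\sqrt2}, & \langle 1,0|J_y^{2n}|1,1\rangle &= 0
\end{aligned}$$

(8) With these relations, the $d^1_{m'm}$ coefficients are calculated as 

$$\begin{aligned}
d^1_{11} &= \langle 1,1|\,e^{-i\beta J_y}\,|1,1\rangle \\
&= \langle 1,1|\left(1 + (-i\beta J_y) + \frac{1}{2!}(-i\beta J_y)^2 + \frac{1}{3!}(-i\beta J_y)^3 + \frac{1}{4!}(-i\beta J_y)^4 + \cdots\right)|1,1\rangle \\
&= 1 - \frac12\frac{\beta^2}{2!} + \frac12\frac{\beta^4}{4!} - \cdots \\
&= \frac12 + \frac12\left(1 - \frac{\beta^2}{2!} + \frac{\beta^4}{4!} - \cdots\right) \\
&= \frac12(1 + \cos\beta)
\end{aligned}$$

$$\begin{aligned}
d^1_{-11} &= \langle 1,-1|\,e^{-i\beta J_y}\,|1,1\rangle \\
&= -\frac12\left(-\frac{\beta^2}{2!} + \frac{\beta^4}{4!} - \cdots\right) \quad (\langle 1,-1|1,1\rangle = 0 \text{ of course}) \\
&= \frac12(1 - \cos\beta)
\end{aligned}$$

$$\begin{aligned}
d^1_{01} &= \langle 1,0|\,e^{-i\beta J_y}\,|1,1\rangle \\
&= \langle 1,0|\left(-i\beta\,\frac{i}{\sqrt2} + i\frac{\beta^3}{3!}\,\frac{i}{\sqrt2} - \cdots\right)|1,0\rangle \\
&= \frac{1}{\sqrt2}\left(\beta - \frac{\beta^3}{3!} + \cdots\right) \\
&= \frac{1}{\sqrt2}\sin\beta
\end{aligned}$$

$$\begin{aligned}
d^1_{00} &= \langle 1,0|\,e^{-i\beta J_y}\,|1,0\rangle \\
&= -i\sqrt2\,\langle 1,0|\,e^{-i\beta J_y}J_y\,|1,1\rangle \\
&= \cos\beta
\end{aligned}$$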
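The four results above can be cross-checked numerically by summing the power series for exp(−iβJ_y) directly. The Python sketch below is an illustration under stated assumptions: the 3 × 3 matrix `JY` is our own transcription of J_y = −(i/2)(J₊ − J₋) for j = 1 in the basis (|1,1⟩, |1,0⟩, |1,−1⟩), with rows labelled by m′ and columns by m, and the helper names are hypothetical.

```python
from math import cos, sin, sqrt

R = 1 / sqrt(2)
# <1,m'| J_y |1,m> with m', m running over +1, 0, -1.
JY = [[0, -1j * R, 0],
      [1j * R, 0, -1j * R],
      [0, 1j * R, 0]]


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def rotation_matrix(beta, terms=40):
    """d^1(beta) = exp(-i beta J_y), summed term by term."""
    n = len(JY)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        # term_k = term_(k-1) * (-i beta J_y) / k = (-i beta J_y)^k / k!
        step = [[-1j * beta * x / k for x in row] for row in JY]
        term = matmul(term, step)
        result = [[result[i][j] + term[i][j] for j in range(n)]
                  for i in range(n)]
    return result


beta = 0.7
d = rotation_matrix(beta)
d_11, d_m11, d_01, d_00 = d[0][0], d[2][0], d[1][0], d[1][1]
```

With this indexing, `d_11`, `d_m11`, `d_01`, and `d_00` should agree with ½(1 + cos β), ½(1 − cos β), sin β/√2, and cos β respectively, up to rounding, with negligible imaginary parts.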


Example: e⁻e⁺ → μ⁻μ⁺ 


With reference to Fig. 2.1, the incoming left-handed electron annihilates 
with the right-handed positron. In the electromagnetic interaction, 
a photon is exchanged with the outgoing muon pair. As will be discussed 
in detail in Chapter 6, at high energies, helicity⁸ is conserved 
in this reaction, so the final-state particles must have opposite helicity. 
The amplitudes are given by 

$$A_{11} = A(e^-_L e^+_R \to \mu^-_L \mu^+_R) \propto d^1_{11} = \tfrac12(1 + \cos\theta)$$

$$A_{1-1} = A(e^-_L e^+_R \to \mu^-_R \mu^+_L) \propto d^1_{1-1} = \tfrac12(1 - \cos\theta)$$

$$\frac{d\sigma}{d\cos\theta} \propto |A_{11}|^2 + |A_{1-1}|^2 \propto 1 + \cos^2\theta$$

The two amplitudes should be of equal intensity because of parity 
conservation in the electromagnetic interaction. Figure 2.2 shows 
e⁺e⁻ → μ⁺μ⁻ data from the TASSO experiment at DESY [55]. It is clear that 
the angular distribution is not symmetric about cos θ = 0. This asymmetry 
is evidence for the off-shell influence of the parity-violating Z⁰ 
interfering with the dominant γ exchange. 

⁸ Helicity, σ · p/|p|, is the projection of the spin along the momentum 
direction. 

Fig. 2.1 Helicity conservation in e⁺e⁻ → μ⁺μ⁻. 

Fig. 2.2 Angular distribution for e⁺e⁻ → μ⁺μ⁻, TASSO [55]. 
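As a quick check of the pure photon-exchange prediction, the sum of the squared helicity amplitudes can be evaluated directly. This is a minimal Python sketch with hypothetical function names; it assumes only the two d-functions quoted in this example and ignores the Z⁰ interference discussed above:

```python
from math import cos


def d1_11(theta):
    """d^1_{11}(theta) = (1 + cos theta) / 2."""
    return 0.5 * (1 + cos(theta))


def d1_1m1(theta):
    """d^1_{1-1}(theta) = (1 - cos theta) / 2."""
    return 0.5 * (1 - cos(theta))


def angular_shape(theta):
    """|A_11|^2 + |A_1-1|^2 up to an overall constant."""
    return d1_11(theta) ** 2 + d1_1m1(theta) ** 2
```

Expanding the squares gives `angular_shape(theta)` = ½(1 + cos²θ), which is symmetric under cos θ → −cos θ; any forward–backward asymmetry in the data must therefore come from outside single-photon exchange.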


2.4 Lorentz invariance 


Most high-energy physics requires energy scales E ≫ m_proton, so it 
is essential that the requirements of special relativity be respected. In 
practice, this means identifying suitable 4-vectors and Lorentz invariants. 
Although the position and direction of particles are important when 
actually performing experiments, the results are most often derived from 
knowledge of the energy and momentum of the interacting particles. Here 
we summarize the essentials and define the Lorentz metric: 

• The components of the energy–momentum 4-vector (E, pc) of a 
  particle of rest mass m are related by (in units with c = 1) 

$$E^2 - |\mathbf{p}|^2 = m^2$$

• This relation also defines the Lorentz metric tensor $g_{\mu\nu}$ or $g^{\mu\nu}$: 

$$g_{00} = 1, \quad g_{11} = g_{22} = g_{33} = -1, \quad\text{with all other components} = 0$$

• The scalar product of two 4-vectors $X^\mu = (X^0, \mathbf{X})$ and 
  $Y^\mu = (Y^0, \mathbf{Y})$ is 

$$X \cdot Y = X_\mu Y^\mu = X^\mu Y_\mu = g_{\mu\nu}X^\mu Y^\nu = g^{\mu\nu}X_\mu Y_\nu$$

  The scalar product of two 4-vectors and hence the length of a 
  4-vector are Lorentz invariants. 

Consider next the relationship between the energy–momentum 4-vectors 
p = (E, pc) and p′ = (E′, p′c) of a particle of mass m in two 
Lorentz frames S and S′, where S′ is moving with a speed β = v/c 
along the z axis in frame S. The p_x and p_y components of 3-momentum, 
perpendicular to the boost, are unchanged, but the p_z component along 
the boost direction and the energy are modified: 

$$E' = \gamma(E - \beta p_z c) \tag{2.16}$$
$$p'_z c = \gamma(p_z c - \beta E) \tag{2.17}$$
$$p'_x = p_x \tag{2.18}$$
$$p'_y = p_y \tag{2.19}$$

where $\gamma = 1/\sqrt{1 - \beta^2}$ and β = v/c is the boost. It is convenient to 
have the transformation expressed in terms of angles with respect to the 
z axes: 

$$E' = \gamma(E - \beta pc\cos\theta) \tag{2.20}$$
$$p'c\cos\theta' = \gamma(pc\cos\theta - \beta E) \tag{2.21}$$
$$p'\sin\theta' = p\sin\theta \tag{2.22}$$
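Equations 2.16 to 2.19 are simple to code, and doing so gives a direct check that E² − |p|² is unchanged by the boost. The sketch below is illustrative only; it works in units with c = 1, and the function names are our own:

```python
from math import sqrt


def boost_z(E, px, py, pz, beta):
    """Energy-momentum in a frame S' moving with speed beta along z (c = 1)."""
    gamma = 1.0 / sqrt(1.0 - beta * beta)
    return (gamma * (E - beta * pz),  # eqn 2.16
            px,                       # eqn 2.18
            py,                       # eqn 2.19
            gamma * (pz - beta * E))  # eqn 2.17


def invariant_mass_sq(E, px, py, pz):
    """Lorentz-invariant length E^2 - |p|^2 of the 4-vector."""
    return E * E - (px * px + py * py + pz * pz)


# A proton (m = 0.938 GeV) with an arbitrary 3-momentum in GeV.
m, px, py, pz = 0.938, 0.3, 0.4, 1.2
E = sqrt(m * m + px * px + py * py + pz * pz)
boosted = boost_z(E, px, py, pz, beta=0.6)
```

Here `invariant_mass_sq(*boosted)` should reproduce m² up to rounding, while the transverse components of `boosted` are untouched.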


2.4.1 Invariant variables 


Many high-energy scattering processes can be measured and analysed in 
terms of two variables: the centre-of-mass energy Ecm and the scattering 
or production angle of a ‘leading’ final-state particle 0cm. Consider a 
generic process a+b + c+ X, where a and b are either beam and target 
or colliding beam particles, c is the leading final-state particle, and X 
is the, often unmeasured, remainder of the final state. Two invariant 
variables are useful: 


S= (Pa +p), t= (Pa — pe)? (2.23) 


where Ppa, pp, and pe are the 4-momenta of particles a, b, and c. s is 
the square of the centre-of-mass energy and t is the square of the 4- 
momentum transfer. For example, consider the process 7p > 7X using 
a pion beam of energy Er on a fixed liquid-hydrogen target, for which 
the energy E} and angle 6’ of the leading final-state pion are measured. 
One finds (see Exercise 2.6) that 


s = Em = My +m + 2mpEr, (2.24) 
t = 2m? — 2(E,, E} — kk’ cos6’), (2.25) 


where k and k’ are the magnitudes of the 3-momenta of the two pions. 

For a high-energy collider with equal-mass particles (e.g. LEP with
$e^+e^-$ or LHC with $pp$), $s = 4E_{\rm beam}^2$, ignoring masses. The equivalent
fixed-target beam energy required to give the same $\sqrt{s}$ is $E_b \approx s/(2m_p)$
for a proton target; for example, to achieve $E_{\rm cm} = 7$ TeV would require a
proton-beam energy of $7^2 \times 10^3/(2 \times 0.94) \approx 2.6 \times 10^4$ TeV. Clearly,
colliders are the most energy-efficient way to reach the highest energies.
As already mentioned in Chapter 1 and discussed in more detail in
Chapter 3, the key challenge for a collider is to achieve high enough
luminosity.
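
The arithmetic behind the fixed-target comparison is short enough to script. This is a sketch under the same approximation as the text (beam and target masses neglected in $s$ except for the target mass in the denominator); the function name and the default proton mass of 0.938 GeV are our own choices.

```python
def fixed_target_beam_energy(e_cm_gev, m_target_gev=0.938):
    """Beam energy (GeV) on a stationary target that gives the same sqrt(s).

    Uses E_beam ~ s / (2 m_target), valid when E_beam >> all masses.
    """
    s = e_cm_gev**2
    return s / (2.0 * m_target_gev)

# E_cm = 7 TeV, expressed in GeV; convert the answer back to TeV.
e_beam_tev = fixed_target_beam_energy(7000.0) / 1000.0
print(f"{e_beam_tev:.2e} TeV")
```

The result is of order $10^4$ TeV, which is why the highest energies are reached with colliders.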


2.4.2 Rapidity 


High-energy hadron-hadron interactions tend to produce final states
with limited transverse momentum with respect to the initial beam
direction. In such circumstances, rapidity $y$ and transverse mass $m_T$ are
convenient variables, where


$y = \frac{1}{2} \ln\left(\frac{E + p_z}{E - p_z}\right), \quad m_T^2 = m^2 + p_T^2$


The 4-momentum $p$ of a particle of mass $m$, transverse momentum $p_T$,
and an azimuthal angle $\phi$,


$p = (E, p_T \cos\phi, p_T \sin\phi, p_z)$

may be described instead as

$p = (m_T \cosh y, p_T \cos\phi, p_T \sin\phi, m_T \sinh y)$ (2.26)

Rapidity has the approximate range $(\ln(m/2E), \ln(2E/m))$. A difference
in rapidity is invariant under a Lorentz transformation along the
beam direction, and rapidities are additive under Lorentz boosts along
the beam direction. At high energies, when masses can be ignored, $y$
may be approximated by the pseudorapidity $\eta$:


$y \approx \eta = \frac{1}{2} \ln\left(\frac{1 + \cos\theta}{1 - \cos\theta}\right) = -\ln \tan(\theta/2)$


where $\theta$ is the polar angle. The range of pseudorapidity is $(-\infty, \infty)$.
Note that $\eta = 0$ (or $y = 0$) is perpendicular to the beam line and
large $|\eta|$ (or $|y|$) is close to the beam line. In high-energy hadron-hadron
scattering, it is observed that particle production is roughly uniform in
units of pseudorapidity.
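
Two of the statements above lend themselves to a quick numerical check: that a rapidity difference is unchanged by a boost along the beam, and that $\eta$ approaches $y$ when the mass is small compared with the momentum. The sketch below assumes $c = 1$; the two particles and the boost speed are arbitrary illustrative values.

```python
import math

def rapidity(E, pz):
    return 0.5 * math.log((E + pz) / (E - pz))

def pseudorapidity(theta):
    return -math.log(math.tan(theta / 2.0))

def boost_z(E, pz, beta):
    """Longitudinal boost of (E, pz); transverse components are unaffected."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return gamma * (E - beta * pz), gamma * (pz - beta * E)

# Two illustrative particles: (mass, p_T, p_z) in GeV.
m1, pT1, pz1 = 0.1396, 0.5, 3.0
m2, pT2, pz2 = 0.938, 1.0, -2.0
E1 = math.sqrt(m1**2 + pT1**2 + pz1**2)
E2 = math.sqrt(m2**2 + pT2**2 + pz2**2)

dy_before = rapidity(E1, pz1) - rapidity(E2, pz2)
E1b, pz1b = boost_z(E1, pz1, 0.8)
E2b, pz2b = boost_z(E2, pz2, 0.8)
dy_after = rapidity(E1b, pz1b) - rapidity(E2b, pz2b)

# For the light, energetic particle 1, eta is close to y.
y1 = rapidity(E1, pz1)
eta1 = pseudorapidity(math.atan2(pT1, pz1))
print(dy_before, dy_after, y1, eta1)
```

Each rapidity shifts by the same constant under the boost, so the difference is preserved; this is the sense in which rapidities are additive.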
A useful relation follows from the Jacobian of the transformation from
Cartesian to rapidity momentum components:

$\frac{d^3p}{E} = p_T\, dp_T\, d\phi\, dy = d^2p_T\, dy \to \pi\, dp_T^2\, dy \approx \pi\, dp_T^2\, d\eta$ (2.27)

where for the last two expressions the azimuthal angle has been
integrated out.


2.5 Transitions and observables 


Particle physics experiments often involve measuring decay rates or
scattering cross sections, that is, processes that involve transitions from
one state to another. For a particle of total width $\Gamma$, the lifetime is
given by $\tau = 1/\Gamma$ and the number of particles decays exponentially:


$N(t) = N(0) e^{-t/\tau} = N(0) e^{-\Gamma t}$ (2.28)


where $N(0)$ and $N(t)$ are the numbers of particles at times 0 and $t$,
respectively.⁹ Often a particle will decay to many final states, so it is
useful to define $\Gamma_i$, the partial decay rate to final state $i$. The total rate
is then given by $\Gamma = \sum_i \Gamma_i$, where the sum runs over all final states.
Similarly, the attenuation of a particle beam of flux $I(x)$ is given by


$I(x) = I(0) e^{-x/\ell}$ (2.29)


where $\ell$ is the collision length¹⁰ given by $\ell = 1/(N\sigma)$ for a target of number
density $N$ and scattering cross section $\sigma$. The cross section has physical
dimensions of area, so [energy]$^{-2}$ here. For a thin target of thickness $\delta x$,
the beam attenuation is given by $\delta I \approx I(0) N \sigma\, \delta x$.

9 With $\hbar = c = 1$, both time and length have dimensions of energy$^{-1}$ and
$\tau\Gamma \sim 1$ is the time-energy uncertainty relation.

10 Familiar from the classical kinetic theory of gases as the mean free path.

11 With $\hbar = 1$ and $c = 1$.
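
To connect these natural-unit relations to laboratory numbers, one restores $\hbar$ by hand. The sketch below is illustrative only: it uses $\hbar = 6.582 \times 10^{-25}$ GeV s, the 4.3 MeV width quoted later in this chapter for the $\phi$, and assumed round values for the target density and cross section that are not taken from the text.

```python
import math

HBAR_GEV_S = 6.582e-25   # hbar in GeV s

def lifetime_from_width(width_gev):
    """tau = hbar / Gamma, i.e. tau = 1/Gamma in natural units."""
    return HBAR_GEV_S / width_gev

def surviving_fraction(t, tau):
    """N(t)/N(0) from eqn 2.28."""
    return math.exp(-t / tau)

def collision_length(number_density, cross_section):
    """l = 1/(N sigma); units must be consistent (e.g. cm^-3 and cm^2)."""
    return 1.0 / (number_density * cross_section)

tau_phi = lifetime_from_width(4.3e-3)      # width of 4.3 MeV, in GeV

# Assumed illustrative target: 4.2e22 protons per cm^3 and sigma = 40 mb.
MB_IN_CM2 = 1.0e-27
ell_cm = collision_length(4.2e22, 40.0 * MB_IN_CM2)
print(tau_phi, surviving_fraction(tau_phi, tau_phi), ell_cm)
```

A width of a few MeV corresponds to a lifetime of order $10^{-22}$ s, far too short to measure as a flight distance, which is why such states are characterized by their width.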

In general, the $S$ matrix describing a scattering or decay $A \to B$ is
written as $S = 1 + iT$ and the reduced matrix element $\mathcal{M}(B : A)$ is
defined by


$S_{fi} = \langle B | iT | A \rangle = i (2\pi)^4 \delta^4(p_A - p_B)\, \mathcal{M}(B : A)$ (2.30)


where $p_A$ and $p_B$ are the total 4-momenta of states $A$ and $B$.


2.5.1 Phase space and decay rates 


Transitions to a final state $|B\rangle$ from an initial state $|A\rangle$ are calculated
from Fermi's Golden Rule:


$\Gamma_{fi} = W_{fi} = 2\pi \underbrace{|T_{fi}|^2}_{\text{dynamics}} \times \underbrace{\rho(E_f)}_{\text{kinematics}}$ (2.31)


where


$\Gamma_{fi}$ is the decay width to the final state in question;


$W_{fi}$ is its transition rate: the number of transitions per unit time;


$T_{fi}$ is the matrix element describing the dynamics of the transition:
$T_{fi} = \langle B | V | A \rangle$ (2.32)


with $|A\rangle$ and $|B\rangle$ the initial and final states interacting via a
potential $V$;


$\rho(E_f)$ is the phase-space factor.


The phase-space factor $\rho(E_f)$


This is the number of states available per unit of energy in the final state. 
It is important because it connects the physics contained in the matrix 
element to observable quantities. First, we shall review its calculation 
using non-relativistic quantum mechanics (NRQM) and explain why this 
is not appropriate for use in high-energy physics. 


Non-relativistic quantum mechanics 


In NRQM, the calculation proceeds as follows. Imagine a cube of sides
$L$ containing one of the final-state particles with quantized momentum,
$p = k$ and $k = 2\pi n/L$, where $n$ is an integer.¹¹ Then


$p_x = \frac{2\pi n_x}{L}, \quad p_y = \frac{2\pi n_y}{L}, \quad p_z = \frac{2\pi n_z}{L}$

or $n_x = \frac{L p_x}{2\pi}, \quad n_y = \frac{L p_y}{2\pi}, \quad n_z = \frac{L p_z}{2\pi}$

Each $(p_x, p_y, p_z)$ momentum state resides in the elemental volume in
momentum space, $(2\pi/L)^3 = (2\pi)^3/V$, so that


$\text{total phase space} = N_1 \frac{(2\pi)^3}{V}$


where $N_1$ is the number of momentum states available to one particle.
Normalizing to one particle per spatial elemental volume and rewriting
'total phase space' as the integral over all momenta, we have


$N_1 = \frac{1}{(2\pi)^3} \int d^3p\, V = \frac{1}{(2\pi)^3} \int d^3p$


Next, we scale up to $n$ particles:


$N_{n-1} = \frac{1}{(2\pi)^{3(n-1)}} \int \prod_{j=1}^{n-1} d^3p_j$


from which¹² the density of states, the number of states per unit
energy, is

$\rho(E) = \frac{dN_{n-1}}{dE} = \frac{1}{(2\pi)^{3(n-1)}} \frac{d}{dE} \int \prod_{j=1}^{n-1} d^3p_j$ (2.33)

12 As total momentum conservation constrains the $n$th particle, the number
of available states is that of $n - 1$ free particles.

However, this is not satisfactory for high-energy physics, since we need 
to take into account the Lorentz contraction of the volume element in 
the usual NRQM wavefunction normalization. 


dLips 


The problem is solved by changing the quantum state normalization.
Instead of normalizing to one particle per unit (spatial) volume,
$\int |\psi|^2\, dV = 1$, we use


$\int |\psi|^2\, dV = 2E$, for a particle of energy $E$


For a particle of mass $m$ with 4-momentum $p = (E, \mathbf{p})$ and spin (or
helicity) $\lambda$, this corresponds to a momentum-space normalization of


$\langle p', \lambda' | p, \lambda \rangle = 2E\, \delta_{\lambda\lambda'}\, \delta^3(\mathbf{p} - \mathbf{p}')$


A useful identity, which shows the manifest Lorentz covariance of this
choice, is

$\frac{d^3p}{2E} = \theta(E)\, \delta(p^2 - m^2)\, d^4p$

Before giving the expressions for decay rates and cross sections, there is
a somewhat tricky mathematical difficulty to be handled. In eqn 2.30,
we have a $\delta^4(\cdot)$ from overall 4-momentum conservation, which will be
squared in the calculation. Formally, this is dealt with by using the
identity


$(2\pi)^4 \delta^4(p_f - p_i) = \int e^{i(p_f - p_i)\cdot x}\, d^4x$ (2.34)


to replace one of the $\delta^4(\cdot)$ functions and then using $p_f = p_i$ from the
other $\delta^4(\cdot)$ to give $\int d^4x = VT$. These $VT$ factors cancel with those
that occur in the use of Fermi's Golden Rule, in which the normalized
transition rate per unit volume and unit time appears: $w_{fi} = |S_{fi}|^2/VT$.

13 It is the CMS frame of the decay products.


2.5.2 Decay rate 


The partial decay rate of a state of mass $M$, at rest, into $n$ particles is
related to the reduced matrix element $\mathcal{M}$ by


$d\Gamma(M : m_1, \ldots, m_n) = \frac{(2\pi)^4}{2M} |\mathcal{M}|^2\, d\text{Lips}(P; p_1, \ldots, p_n)$ (2.35)

where dLips is the $n$-body phase space given by


$d\text{Lips}(P; p_1, \ldots, p_n) = \delta^4\left(P - \sum_{i=1}^{n} p_i\right) \prod_{i=1}^{n} \frac{d^3p_i}{(2\pi)^3 2E_i}$ (2.36)


Two-body decay rate 


The simplest case is a two-body decay $X \to a + b$, where the masses of
$X$, $a$, and $b$ are $M$, $m_a$, and $m_b$, respectively. As the final-state integral
is Lorentz-invariant, we are free to choose any frame. The rest frame of
$X$ is most convenient:¹³ $P_X = (M, \mathbf{0})$, $p_a = (E_a, \mathbf{p}_a)$, $p_b = (E_b, \mathbf{p}_b)$. We
have


$\Gamma_{fi} = \frac{(2\pi)^4}{2M} \int |\mathcal{M}_{fi}|^2 \frac{d^3p_a}{(2\pi)^3 (2E_a)} \frac{d^3p_b}{(2\pi)^3 (2E_b)} \times \delta^3(\mathbf{p}_a + \mathbf{p}_b)\, \delta(M - E_a - E_b)$


Gathering constants and using the $\delta^3(\cdot)$ to remove one of the
3-momentum integrals, we obtain


$\Gamma_{fi} = \frac{1}{32\pi^2 M} \int |\mathcal{M}_{fi}|^2 \frac{d^3p_a}{E_a E_b}\, \delta(M - E_a - E_b)$


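In the CMS, we take $\mathbf{p}_a = \mathbf{k}$ and $\mathbf{p}_b = -\mathbf{k}$ and, using polar coordinates,
we have


$d^3p_a = k^2\, dk \sin\theta\, d\theta\, d\phi = k^2\, dk\, d\Omega$, where $k = |\mathbf{k}|$

This gives


$\Gamma_{fi} = \frac{1}{32\pi^2 M} \int |\mathcal{M}_{fi}|^2 \frac{k^2\, dk\, d\Omega}{E_a E_b}\, \delta(M - E_a - E_b)$ (2.37)

Since $\mathbf{p}_a = -\mathbf{p}_b = \mathbf{k}$, $E_i^2 = m_i^2 + k^2$ and eqn 2.37 becomes


$\Gamma_{fi} = \frac{1}{32\pi^2 M} \int |\mathcal{M}_{fi}|^2\, g(k)\, \delta(f(k))\, dk\, d\Omega$


where


$g(k) = \frac{k^2}{E_a E_b}$
$f(k) = M - \sqrt{m_a^2 + k^2} - \sqrt{m_b^2 + k^2}$


Denoting by $p^*$ the value of $k = |\mathbf{p}_a|$ that satisfies energy conservation,
$f(k) = 0$, $g(k)$ can be integrated to give¹⁴


$\int g(k)\, \delta(f(k))\, dk = \int g(k)\, \delta(k - p^*) \left|\frac{df}{dk}\right|^{-1}_{p^*} dk = g(p^*) \left|\frac{df}{dk}\right|^{-1}_{p^*} = \left|\frac{df}{dk}\right|^{-1}_{p^*} \frac{(p^*)^2}{E_a E_b}$

14 Using the relationship $\delta(f(x)) = \left|\frac{df}{dx}\right|^{-1}_{x=a} \delta(x - a)$, where $f(a) = 0$.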


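The inverted modulus term is obtained as follows:


$\frac{df}{dk} = -\frac{k}{\sqrt{m_a^2 + k^2}} - \frac{k}{\sqrt{m_b^2 + k^2}} = -\frac{k}{E_a} - \frac{k}{E_b} = -k\, \frac{E_a + E_b}{E_a E_b}$

so

$\left|\frac{df}{dk}\right|^{-1}_{p^*} = \frac{1}{p^*} \frac{E_a E_b}{E_a + E_b}$


We then have


$\Gamma_{fi} = \frac{1}{32\pi^2 M} \frac{E_a E_b}{p^* (E_a + E_b)} \frac{(p^*)^2}{E_a E_b} \int |\mathcal{M}_{fi}|^2\, d\Omega$


By energy conservation, $E_a + E_b = M$, so, finally,


$\Gamma_{fi} = \frac{p^*}{32\pi^2 M^2} \int |\mathcal{M}_{fi}|^2\, d\Omega$ (2.38)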


where the angular integral is over the solid angle of particle $a$.

15 Try Exercise 2.5 at the end of the chapter for an alternative derivation of
these results.

16 The Dalitz plot is named after R. H. Dalitz, who invented it for the
study of $K \to 3\pi$ decays.


Two-body decay kinematics 


First, we calculate $p^*$ from the original condition on the $\delta$-function,
$f(k) = 0$:


$M = \sqrt{m_a^2 + p^{*2}} + \sqrt{m_b^2 + p^{*2}}$

Some straightforward algebra yields¹⁵


$p^* = \frac{1}{2M} \sqrt{[M^2 - (m_a + m_b)^2][M^2 - (m_a - m_b)^2]}$ (2.39)


Then, using $E^2 = p^2 + m^2$, the energies $E_a$ and $E_b$ are given by


$E_a = \frac{M^2 + m_a^2 - m_b^2}{2M}, \quad E_b = \frac{M^2 - m_a^2 + m_b^2}{2M}$ (2.40)

Note that energy and momentum conservation fix both the momenta and 
energies of the particles in a two-body decay. This is most easily seen 
in the rest frame of the decaying particle (or equivalently in the CMS 
frame of the decay products)—the only freedom is an azimuthal rotation 
about the common axis of the momenta of the two decay products. 
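
Equations 2.39 and 2.40 translate directly into a small helper. The sketch below is illustrative: the function name is ours, and the masses used for the example ($\phi \to K^+K^-$, in GeV) are approximate values supplied as an assumption rather than taken from the text.

```python
import math

def two_body(M, ma, mb):
    """Return (p*, E_a, E_b) for a decay of mass M at rest into ma + mb."""
    p_star = math.sqrt((M**2 - (ma + mb)**2) * (M**2 - (ma - mb)**2)) / (2.0 * M)  # eqn 2.39
    Ea = (M**2 + ma**2 - mb**2) / (2.0 * M)   # eqn 2.40
    Eb = (M**2 - ma**2 + mb**2) / (2.0 * M)
    return p_star, Ea, Eb

# Approximate masses in GeV (assumed for illustration): phi -> K+ K-.
M, ma, mb = 1.0195, 0.4937, 0.4937
p_star, Ea, Eb = two_body(M, ma, mb)
print(p_star, Ea, Eb)
```

Because the kaon pair mass is only just below the parent mass in this example, $p^*$ comes out small compared with $M/2$; the energies nevertheless sum to $M$ exactly, as energy conservation requires.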


Three-body decays 


For a three-body decay, there are no longer enough constraints from 
4-momentum conservation to determine the energies of the decay prod- 
ucts, even in the rest frame of the parent particle. However, there are 
limitations that can be understood most easily by considering one decay 
product at rest with the other two particles then sharing the available en- 
ergy as in a two-body decay. The most elegant approach to three-body 
decays is the Dalitz plot—a two-dimensional plot of either two decay 
particle energies or two decay particle invariant mass pairs.¹⁶ We shall
use the latter, since the result is then manifestly Lorentz-invariant. 


• Write the invariant mass of particles 1 and 2, $M_{12}$, explicitly in terms of
$E_3$ and $\mathbf{p}_3$:


$M_{12}^2 = (E_1 + E_2)^2 - (\mathbf{p}_1 + \mathbf{p}_2)^2 = (M - E_3)^2 - \mathbf{p}_3^2$


• Identify $m_3$:


$M_{12}^2 = M^2 - 2ME_3 + E_3^2 - \mathbf{p}_3^2 = M^2 + m_3^2 - 2ME_3$ (2.41)


• Differentiate with respect to $E_3$:


$d(M_{12}^2) \propto dE_3$ (2.42)


• By similar reasoning,
$d(M_{23}^2) \propto dE_1$

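Next, consider the infinitesimal three-body phase-space element $d\rho_3$,
ignoring constant factors:


• As in the two-body case, one of the integrals over $d^3p_i$ is removed
using the 3-momentum $\delta^3(\cdot)$ function:

$d\rho_3 \propto \frac{d^3p_1\, d^3p_2}{E_1 E_2 E_3}\, \delta(E_1 + E_2 + E_3 - M)$


• Change to spherical coordinates:

$d^3p_1\, d^3p_2 = dp_1\, p_1\, d\theta_1\, p_1 \sin\theta_1\, d\phi_1\; dp_2\, p_2\, d\theta_2\, p_2 \sin\theta_2\, d\phi_2$

• Redefine the solid angle:


$d\Omega_1\, d\Omega_2 = \sin\theta_1\, d\theta_1 \sin\theta_2\, d\theta_2\, d\phi_1\, d\phi_2$
$= \sin\theta_1\, d\theta_1 \sin\theta_{12}\, d\theta_{12}\, d\phi_1\, d\phi_2$


Then

$d^3p_1\, d^3p_2 = p_1^2 p_2^2\, dp_1\, dp_2 \sin\theta_1\, d\theta_1 \sin\theta_{12}\, d\theta_{12}\, d\phi_1\, d\phi_2$


Simplify, noting that one part of the integration is trivial:


$\int d^3p_1\, d^3p_2 = 8\pi^2 \int p_1^2 p_2^2\, dp_1\, dp_2 \sin\theta_{12}\, d\theta_{12}$


• Consider the momentum-squared of the back-to-back systems, $\mathbf{p}_3$
versus $\mathbf{p}_1 + \mathbf{p}_2$:


$p_3^2 = (\mathbf{p}_1 + \mathbf{p}_2)^2 = p_1^2 + p_2^2 + 2 p_1 p_2 \cos\theta_{12}$

• For a given $|\mathbf{p}_1|$ and $|\mathbf{p}_2|$, $p_3$ depends only on $\theta_{12}$:

$2 p_3\, dp_3 = -2 p_1 p_2 \sin\theta_{12}\, d\theta_{12}$


$\sin\theta_{12}\, d\theta_{12} \propto \frac{p_3}{p_1 p_2}\, dp_3$


Put this result into $d\rho_3$, giving


$d\rho_3 \propto \frac{1}{E_1 E_2 E_3}\, p_1^2 p_2^2\, dp_1\, dp_2\, \frac{p_3}{p_1 p_2}\, dp_3\, \delta(E_1 + E_2 + E_3 - M)$
$= \frac{p_1 p_2 p_3}{E_1 E_2 E_3}\, dp_1\, dp_2\, dp_3\, \delta(E_1 + E_2 + E_3 - M)$


• Next, change variables, $p_i\, dp_i = E_i\, dE_i$, and remove one term using
the $\delta$-function:


$d\rho_3 \propto dE_1\, dE_2\, dE_3\, \delta(E_1 + E_2 + E_3 - M)$


• Finally, from eqn 2.42, $d(M_{12}^2) \propto dE_3$ and $d(M_{23}^2) \propto dE_1$:


$d\rho_3 \propto d(M_{12}^2)\, d(M_{23}^2)$ (2.43)

Fig. 2.3 A Dalitz plot [114]. Axis label: $m_{12}^2$ (GeV$^2$).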


Figure 2.3 shows a Dalitz plot. The importance of the Dalitz plot is
contained in this result: the phase space available in a three-body decay
is uniform across the plot, here for $M_{12}^2$ versus $M_{23}^2$. This means that
dynamical structure, for example a resonance in one invariant mass pair,
shows up as a region of higher or lower density in the Dalitz plot. These
regions can then be seen clearly as peaks and troughs in appropriate
one-dimensional projections. An example from the BaBar experiment [41]
at the PEP-II $e^+e^-$ storage ring is shown in Fig. 2.4. The Dalitz plot for
$D_s^+ \to \pi^+\pi^-\pi^+$ is presented in terms of two $\pi^+\pi^-$ invariant mass
combinations. The event distribution is clearly non-uniform, with two narrow
$\pi\pi$ resonant bands at invariant masses corresponding to the $f_0(980)$
state. What might cause the accumulation of events around 1.9 GeV$^2$ on
the diagonal?

Fig. 2.4 $D_s^+ \to \pi^+\pi^-\pi^+$ Dalitz plot from the BaBar experiment [41].
Resonant bands in the two $\pi^+\pi^-$ invariant mass distributions can be seen
clearly. Both axes show $m^2(\pi^+\pi^-)$ in GeV$^2/c^4$.


2.5.3 Cross section 


To calculate the total cross section for a collider process $a + b \to X$, we
start from Fermi's Golden Rule:


flux $\times\, \sigma(ab \to X) = w_{fi} \times$ [final-state phase space]


where $w_{fi} = |S_{fi}|^2/VT$ is the transition rate and $VT$ is the total
space-time volume. In a colliding-beam experiment, the initial-state particle
flux is $2E_a 2E_b |\mathbf{v}_a - \mathbf{v}_b|$, where $\mathbf{v}_i$ are the particle velocities.¹⁷

Inserting the expression from eqn 2.30 for $S_{fi}$ in terms of the reduced
matrix element into the above equation gives rise to the square of the
overall 4-momentum conservation delta function. Formally, this is
handled in the same way as for the decay rate calculation (eqn 2.34). The
identity


$(2\pi)^4 \delta^4(p_f - p_i) = \int d^4x\, e^{i(p_f - p_i)\cdot x}$


is used to replace one of the $\delta^4(\cdot)$; then performing the integration with
$p_f = p_i$ on account of the other $\delta^4(\cdot)$ gives $\int d^4x = VT$. The $VT$ factors
then cancel to give


$\sigma(ab \to X) = \frac{1}{2E_a 2E_b |\mathbf{v}_a - \mathbf{v}_b|} \int d\text{Lips}(s : X)\, |\mathcal{M}(X : ab)|^2$


where dLips is the Lorentz-invariant phase space, which for a final state
with $n_X$ particles is


$d\text{Lips}(s : X) = (2\pi)^4 \delta^4(p_X - p_a - p_b) \prod_{i=1}^{n_X} \frac{d^3k_i}{(2\pi)^3 2k_i^0}$


where $s = (p_a + p_b)^2$ and $p_X = \sum_i k_i$. For a total cross section to all
final states $X$, an additional $\sum_X$ is performed. For the unpolarized cross
section for particles with spin, final spin states are summed and initial
states averaged. This gives an additional term, $1/[(2S_a + 1)(2S_b + 1)]$, on
the right-hand side, where $S_a$ and $S_b$ are the spins of particles $a$ and $b$.
If a differential cross section is required, then the relevant variables are
excluded from the phase-space integral.

17 The energy factors in the flux definition are a consequence of the
Lorentz-invariant normalization.

18 Exercise 2.9 covers this calculation.


Flux factor 


The flux factor $2E_a 2E_b |\mathbf{v}_a - \mathbf{v}_b|$ is a Lorentz invariant and may be written
in a number of ways. In the fixed-target frame, with $b$ at rest, it is
$2E_a 2m_b |\mathbf{v}_a|$. The following relations also hold:


$2E_a 2E_b |\mathbf{v}_a - \mathbf{v}_b| = 4\sqrt{(p_a \cdot p_b)^2 - m_a^2 m_b^2} = 2\sqrt{\lambda(s, m_a^2, m_b^2)}$


where


$\lambda(s, m_a^2, m_b^2) = (s - m_a^2 - m_b^2)^2 - 4 m_a^2 m_b^2$
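
The equality of the velocity form and the $\lambda$ form of the flux factor is easy to confirm for collinear beams. The sketch below assumes $c = 1$ and momenta along a common axis; the masses and momenta are arbitrary illustrative values, and the helper names are ours.

```python
import math

def kallen(s, ma2, mb2):
    """lambda(s, ma^2, mb^2) as defined in the text."""
    return (s - ma2 - mb2)**2 - 4.0 * ma2 * mb2

def flux_from_velocities(Ea, pza, Eb, pzb):
    """2Ea 2Eb |va - vb| for two particles moving along the z axis."""
    va, vb = pza / Ea, pzb / Eb
    return 2.0 * Ea * 2.0 * Eb * abs(va - vb)

# Illustrative head-on configuration (GeV), not a real machine.
ma, mb = 0.1396, 0.938
pza, pzb = 5.0, -3.0
Ea = math.sqrt(ma**2 + pza**2)
Eb = math.sqrt(mb**2 + pzb**2)

s = (Ea + Eb)**2 - (pza + pzb)**2
flux_v = flux_from_velocities(Ea, pza, Eb, pzb)
flux_lambda = 2.0 * math.sqrt(kallen(s, ma**2, mb**2))
print(flux_v, flux_lambda)
```

Setting `pzb = 0.0` reproduces the fixed-target expression $2E_a 2m_b |\mathbf{v}_a|$ from the same function.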


Two-body scattering 


Consider the special case of two-body scattering $a + b \to a' + b'$ with
4-momenta $p_a = p_1$, $p_b = p_2$, $p_{a'} = p_3$, $p_{b'} = p_4$. The cross section in
the fixed-target frame (particle $b$ at rest) is given by


$d\sigma = \frac{|\mathcal{M}|^2}{2E_1 2m_2 |\mathbf{v}_1|}\, d\text{Lips}(s; p_3, p_4)$ (2.44)


where in this case


$d\text{Lips}(s; p_3, p_4) = (2\pi)^4 \delta^4(p_3 + p_4 - p_1 - p_2) \frac{d^3p_3}{2E_3 (2\pi)^3} \frac{d^3p_4}{2E_4 (2\pi)^3}$


with $s = (p_3 + p_4)^2 = (p_1 + p_2)^2$. dLips may be evaluated by using the
$\delta^4(\cdot)$ to integrate over four of the six variables $d^3p_3\, d^3p_4$, and doing this
in the CMS frame gives¹⁸


$\frac{d\sigma}{d\Omega} = \frac{1}{(8\pi w)^2} \frac{p'}{p} |\mathcal{M}|^2$ (2.45)


where $w = \sqrt{s}$ is the total CMS energy and $p$ and $p'$ are the magnitudes
of the initial and final CMS 3-momenta, respectively. The above cross
section has dimensions of [energy]$^{-2}$. To get back to physical units of
area, one must multiply by $(\hbar c)^2 = 0.389$ mb GeV$^2$.
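
As a worked illustration of eqn 2.45 and of the unit conversion, the sketch below evaluates the differential cross section for a constant $|\mathcal{M}|^2$ and converts the angle-integrated result to millibarns. The matrix-element value, energy, and momenta are hypothetical numbers chosen only to exercise the formula.

```python
import math

HBARC2_MB_GEV2 = 0.389   # (hbar c)^2 in mb GeV^2, as quoted in the text

def dsigma_domega(m_squared, w, p_in, p_out):
    """Eqn 2.45 in natural units: GeV^-2 per steradian."""
    return m_squared * (p_out / p_in) / (8.0 * math.pi * w)**2

def gev_minus2_to_mb(x):
    """Convert a cross section from GeV^-2 to millibarns."""
    return x * HBARC2_MB_GEV2

# Hypothetical elastic process: |M|^2 = 100 (dimensionless), w = 10 GeV.
ds = dsigma_domega(100.0, 10.0, 5.0, 5.0)
sigma_total_mb = gev_minus2_to_mb(4.0 * math.pi * ds)   # isotropic, so multiply by 4 pi
print(ds, sigma_total_mb)
```

For an isotropic matrix element the angular integral is just a factor of $4\pi$; a realistic $|\mathcal{M}|^2$ would depend on the scattering angle and require a numerical integral instead.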


2.5.4 Breit-Wigner 


In nuclear physics, $\tau$ can span an enormous range, but particle
physics typically deals with short timescales: $10^{-23}$ to $10^{-8}$ s. As we
have seen above, the lifetime of an unstable state is inversely related to
the total decay rate $\Gamma$ ($\Gamma\tau \sim 1$). This is a form of the time-energy
uncertainty relation: the uncertainty in lifetime translates to an uncertainty
in mass. The state is said to have a natural width


$\Gamma = 1/\tau$ and $N(t) \propto e^{-\Gamma t}$ (2.46)


For long-lived particles (meaning $\sim 10^{-18}$ s or more), the natural width
is so small that it is better to quote mass and lifetime.


A simple model for such states gives rise to the Breit-Wigner line
shape (Fig. 2.5). We proceed as follows:


• The wavefunction of a state of energy $E_R$ and lifetime $\tau = 1/\Gamma$ is

$\psi(t) = \psi(0)\, e^{-iE_R t} e^{-t/2\tau} = \psi(0)\, e^{-t(iE_R + \Gamma/2)}$


• The intensity $\psi^*\psi$ follows the exponential decay law $I \propto e^{-\Gamma t}$. The
amplitude as a function of $E$ is derived from the Fourier transform:


$\chi(E) = \int \psi(t)\, e^{iEt}\, dt = \psi(0) \int e^{-t[(\Gamma/2) + i(E_R - E)]}\, dt \propto \frac{1}{(E - E_R) + i\Gamma/2}$


• The cross section $\sigma(E) \propto \chi^*\chi$:


$\sigma(E) = \sigma_{\max} \frac{\Gamma^2/4}{(E - E_R)^2 + \Gamma^2/4}$ (2.47)
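
The line shape of eqn 2.47 is simple to tabulate. The sketch below is a direct transcription with $\sigma_{\max}$ set to 1; the resonance energy and width used in the example are illustrative values in GeV, and the function name is ours.

```python
def breit_wigner(E, E_R, gamma, sigma_max=1.0):
    """Non-relativistic Breit-Wigner line shape, eqn 2.47."""
    return sigma_max * (gamma**2 / 4.0) / ((E - E_R)**2 + gamma**2 / 4.0)

# Illustrative narrow resonance: E_R = 1.0195 GeV, Gamma = 4.3 MeV.
E_R, gamma = 1.0195, 0.0043
peak = breit_wigner(E_R, E_R, gamma)
half = breit_wigner(E_R + gamma / 2.0, E_R, gamma)
print(peak, half)
```

The curve falls to half its peak value at $E = E_R \pm \Gamma/2$, so $\Gamma$ is the full width at half maximum of the peak.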

The value of $\sigma_{\max}$ can be found using a heuristic argument from wave
optics following Perkins [117]. The angular momentum of a particle with
momentum $p$ about a scattering centre may be written as $L = pb$,
where $b$ is the impact parameter.¹⁹ Particles in a beam of fixed
momentum will have $b$ in the range $(0, L/p)$ and will impact on the target
plane within a circular disc of area $\pi b^2$. The wavefunction of the
incident particles can be expanded as a sum of partial waves with quantized
angular momentum $L = l$, where $l$ is an integer. The fraction of the
beam with $L \in (l, l+1)$ hits the target plane within an annulus of area
$\pi[(l+1)^2 - l^2]/p^2 = (2l+1)\pi(1/p)^2$. To take account of elastic and
totally absorptive scattering, the elastic scattering amplitude is doubled,
leading to a factor of 4 in this expression.²⁰ For scattering through the
$l$th partial wave, the result is $\sigma_{\max} = 4\pi (1/p)^2 (2l+1)$.

A few more details are needed: 


• So far, the spin of the particles has been ignored. Since the
Breit-Wigner formula is a total cross section, we sum over the final-state
spins and average over the initial-state spins, giving a factor


$\frac{2J + 1}{(2S_a + 1)(2S_b + 1)}$


For scattering through a resonant state with spin $J$, $l \to J$.


• The expression in eqn 2.47 for the energy dependence is not in
a Lorentz-invariant form. This is rectified by multiplying it by
$(E + E_R)^2$ and using the approximation $E \approx E_R = M_R$ where
appropriate.


• Often, a resonance will occur in many different scattering processes
(or channels), and to allow for this, the expression in eqn 2.47 is
further multiplied by the branching ratios $\text{BR}_{\rm in}$ and $\text{BR}_{\rm out}$ for the
entrance and exit channels.


The result is


$\sigma(s) = \sigma_{\max} \frac{M_R^2 \Gamma^2}{(s - M_R^2)^2 + M_R^2 \Gamma^2}$ (2.48)

where

$\sigma_{\max} = \frac{4\pi}{p^2} \frac{2J + 1}{(2S_a + 1)(2S_b + 1)}\, \text{BR}_{\rm in}\, \text{BR}_{\rm out}$ (2.49)


Fig. 2.5 Breit-Wigner line shape.

Fig. 2.6 Breit-Wigner line shapes for $\phi \to K^+K^-$, from the LHCb
experiment at the LHC [100]. The Breit-Wigner line shape is superimposed on
a gradually rising background shown by the dashed line.

19 The impact parameter is defined as the perpendicular distance from the
scattering centre to the direction of travel of the object.

20 For more details on partial wave analysis, see the Further Reading at the
end of the chapter.

21 The full width at half the maximum height of the peak (FWHM).

Two examples of Breit-Wigner line shapes are shown in Figs. 2.6 and 2.7,
for the decays $\phi \to K^+K^-$ and $K^* \to K^+\pi^-$, respectively. Both states
have comparable masses ($\phi \sim 1020$ MeV, $K^* \sim 900$ MeV), but the $K^*$
width²¹ is about 50 MeV, while that of the $\phi$ is only 4.3 MeV. Try
Exercise 2.8 to help understand this.


2.6 Luminosity and event rates 


In Section 2.5.3, a relationship among event rate, flux, and cross section
was introduced and elaborated, particularly for two-body scattering. The
strategies for optimizing and measuring the luminosity in colliders are
described in Section 3.4. The critical issue of 'triggering' (i.e. selecting
the interesting events to keep) is introduced in Section 4.10 and discussed
in slightly more detail for the LHC in Section 13.2.


2.7 Group theory 


This section provides a brief introduction to group theory, with emphasis
on the groups of relevance to physics and particularly particle physics.
The mathematical definition of a group $G$ considers a set of objects
$(a, b, c, \ldots)$ with a multiplication rule satisfying the following:


• for objects $a, b \in G$, the product $ab$ is also a member of $G$;

• it is associative: $a(bc) = (ab)c$ for $a, b, c \in G$;

• $G$ contains a unit element $e$ such that $ae = ea = a$ for all $a \in G$;

• for all $a \in G$, there exists an inverse $a^{-1} \in G$ satisfying
$aa^{-1} = a^{-1}a = e$.


There are many different types of group of relevance to physics:


$S_n$ the group of permutations of $n$ objects ($n!$ elements), which involves
discrete operations;

$T_2$ translations of two-dimensional vectors in a plane,
$T_{\mathbf{a}} : \mathbf{x} \to \mathbf{x}' = \mathbf{x} + \mathbf{a}$;

$R_3$ rotations of three-dimensional vectors, $x'_i = R_{ij} x_j$. $R_3$ conserves
distance, which requires $\det(R) = 1$ and $\sum_i R_{ij} R_{ik} = \delta_{jk}$ ($R$ is a
proper orthogonal matrix);

$U(n)$, $SU(n)$ the groups of unitary and unimodular $n \times n$ matrices $U$
satisfying $UU^\dagger = U^\dagger U = I_n$, with $\det(U) = e^{i\phi}$ for $U(n)$ and
$\det(U) = +1$ for $SU(n)$.


$T_2$ and $R_3$ depend on two and three parameters, respectively.²² These
numbers are known as the order of the group. For $T_2$, the sequence in
which two successive translations are applied does not affect the outcome:
$T_2(\mathbf{a}) T_2(\mathbf{a}') = T_2(\mathbf{a}') T_2(\mathbf{a})$; such a group is called Abelian. The groups
$R_3$ and $SU(n)$ are non-Abelian, so the sequence of successive operations
matters.

22 For $T_2$, a two-dimensional vector translation in a plane requires the
distance that the centre of the vector is moved in the plane and the angle of
rotation (also within the plane) with respect to the original direction of the
line. For $R_3$, a rotation in three dimensions requires the direction of the axis
about which the rotation takes place (two direction cosines) and the angle of
rotation about this axis.

23 Also known as the basis of the Lie algebra; $n$ is the order of the group.

24 That is, $UU^\dagger = U^\dagger U = I$.
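
The distinction between Abelian and non-Abelian groups can be made concrete with a few lines of arithmetic. The sketch below uses plain nested lists for $3 \times 3$ matrices; the quarter-turn angle and the sample vectors are arbitrary illustrative choices.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rot_x(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(3) for j in range(3))

def translate(x, a):
    """Apply the translation T_a to a two-dimensional vector x."""
    return (x[0] + a[0], x[1] + a[1])

quarter_turn = math.pi / 2.0
xz = matmul(rot_x(quarter_turn), rot_z(quarter_turn))
zx = matmul(rot_z(quarter_turn), rot_x(quarter_turn))
rotation_mismatch = max_abs_diff(xz, zx)    # non-zero: R3 is non-Abelian

x, a, b = (1.0, 2.0), (0.5, -1.0), (2.0, 3.0)
ab = translate(translate(x, a), b)
ba = translate(translate(x, b), a)          # same point: T2 is Abelian
print(rotation_mismatch, ab == ba)
```

Two quarter-turns about different axes give different results depending on their order, whereas the two translations land on the same point either way.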


2.7.1 Lie groups 


Consider a matrix group $\{A\}$. An element $A$ must have an inverse, and
hence $\det(A) \neq 0$; thus, there exists a matrix $a$ such that


$A = \exp(a) = I + a + a^2/2 + \ldots$


and $a$ is the logarithm of $A$. The set of all matrices $a$ whose exponentials
belong to a group $G$ is known as the Lie algebra of $G$. Using the definition
of the exponential series, we have


$A = \lim_{k \to \infty} \left(I + \frac{a}{k}\right)^k$ and inverse $a = \lim_{k \to \infty} k\,(A^{1/k} - I)$

For large $k$, the matrix $I + a/k$ is an operator of the group and gives
$A$ by iteration. It is an infinitesimal operator, since for large $k$ it differs
from the identity operator by an infinitesimal amount.
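
The statement that a finite group element is built by iterating an infinitesimal one can be tried out on the simplest case, a rotation in a plane. In the sketch below the generator $a$ is the antisymmetric $2 \times 2$ matrix for a rotation by an angle of 0.7 rad (an arbitrary choice), and $k = 2^{20}$ stands in for the limit $k \to \infty$.

```python
import math

def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def mat_power(m, k):
    """m**k for a non-negative integer k, by repeated squaring."""
    result = [[1.0, 0.0], [0.0, 1.0]]
    base = m
    while k > 0:
        if k & 1:
            result = matmul2(result, base)
        base = matmul2(base, base)
        k >>= 1
    return result

def exp_by_limit(a, k):
    """Approximate exp(a) by (I + a/k)**k."""
    step = [[(1.0 if i == j else 0.0) + a[i][j] / k for j in range(2)]
            for i in range(2)]
    return mat_power(step, k)

theta = 0.7
a = [[0.0, -theta], [theta, 0.0]]   # generator of a rotation in the plane
A = exp_by_limit(a, 2**20)
print(A[0][0], math.cos(theta), A[1][0], math.sin(theta))
```

The iterated product approaches the familiar rotation matrix with entries $\cos\theta$ and $\sin\theta$, with an error that shrinks roughly as $1/k$.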

The product of two elements of a group is a member of the group,
which for a finite group must then be expressible as a linear combination
of the $n$ generators of the Lie algebra.²³ This provides a set of relations
among the commutators of the basis elements $\{g_i\}$, $i = 1, \ldots, n$:


$[g_i, g_j] = i c_{ij}^k g_k$ (2.50)


with an implied sum over $k$. There will be $n(n-1)/2$ such relations, and the
$c_{ij}^k$ are known as the structure constants of the group.


2.7.2 U(n) and SU(n) 


The Lie groups $U(n)$ and $SU(n)$ occur in a number of different contexts
in particle physics. $U(1)$ and $SU(2)$ are used in the construction of the
electroweak sector of the Standard Model. $SU(2)$ and $SU(3)$ occur as
approximate symmetries used to classify nuclear and particle states. QCD
is based on an exact $SU(3)$ symmetry, whose eight gluons carry the strong
force that binds the quarks into hadrons.


$U(n)$ The set of all $n \times n$ unitary²⁴ matrices forms the $U(n)$ group. Because
$UU^\dagger = I$, $\det(UU^\dagger) = 1$, and so we have
$\det(UU^\dagger) = \det(U) \det(U^\dagger)$
$= (\det U)(\det U)^*$
$= |\det U|^2 = 1$
hence $\det U = e^{i\phi}$


$SU(n)$ This is the subset of $U(n)$ for which $\phi = 0$, i.e. $\det(U) = 1$.
Identifying $U = e^{ig}$, where $g$ is the generator of the operator $U$, and
using the relation


$\det(e^{ig}) = e^{i\,\mathrm{Tr}(g)}$


it follows that $e^{ig}$ is an element of $SU(n)$ if, and only if, $g$ is traceless:
$\mathrm{Tr}(g) = 0 \Rightarrow e^{i\,\mathrm{Tr}(g)} = 1$.


Once the relevant group has been identified, properties of a model can
be inferred from the properties of that group:


• for $SU(n)$, the number of independent generators is $n^2 - 1$;
• so for $SU(2)$ this is $2^2 - 1 = 3$, and for $SU(3)$ it is $3^2 - 1 = 8$.


2.7.3 SU(2) 


The simplest $SU(n)$ group is $SU(2)$. It is the group of $2 \times 2$ unitary
matrices $U$ with $\det(U) = 1$. Writing $U = e^{ih}$, unitarity gives


$UU^\dagger = e^{ih} e^{-ih^\dagger} = I$


From the unitarity condition, $UU^\dagger = U^\dagger U = I$, so $U$ and $U^\dagger$ commute
and it follows that $h$ and $h^\dagger$ also commute. Under these conditions,


$e^{ih} e^{-ih^\dagger} = e^{i(h - h^\dagger)} = I = e^0$


so $h - h^\dagger = 0$. This is the condition for $h$ to be a Hermitian matrix.


$SU(2)$ has three generators, and the three $2 \times 2$ traceless Hermitian
matrices making up the Lie algebra are the Pauli matrices:


$\sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$ (2.51)


Any $2 \times 2$ traceless Hermitian matrix $h$ can be written as a linear
combination of the three Pauli matrices with real parameters $\lambda_i$:


$h = \lambda_1\sigma_1 + \lambda_2\sigma_2 + \lambda_3\sigma_3$ (2.52)
The commutation relations of the Pauli matrices are
$[\sigma_1, \sigma_2] = 2i\sigma_3, \quad [\sigma_2, \sigma_3] = 2i\sigma_1, \quad [\sigma_3, \sigma_1] = 2i\sigma_2$ (2.53)


So for this group the non-zero structure constants all have magnitude 2.
The angular momentum vector of a spin-$\frac{1}{2}$ particle is $J_i = \frac{1}{2}\sigma_i$, with the
usual angular momentum commutation relations $[J_i, J_j] = i\epsilon^{ijk} J_k$.

The Pauli matrices provide the fundamental or irreducible matrix
representation of the group $SU(2)$. From these, representations of higher
angular momentum states can be constructed; see the Further Reading.
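
The commutation relations of eqn 2.53 can be confirmed by direct multiplication. The sketch below stores each Pauli matrix as a nested list of Python complex numbers; the helper names are ours.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] - ba[i][j] for j in range(2)] for i in range(2)]

def scale(c, m):
    return [[c * m[i][j] for j in range(2)] for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

sigma1 = [[0, 1], [1, 0]]
sigma2 = [[0, -1j], [1j, 0]]
sigma3 = [[1, 0], [0, -1]]

# Each entry is True if the corresponding relation in eqn 2.53 holds.
checks = [
    commutator(sigma1, sigma2) == scale(2j, sigma3),
    commutator(sigma2, sigma3) == scale(2j, sigma1),
    commutator(sigma3, sigma1) == scale(2j, sigma2),
]
traces = [trace(s) for s in (sigma1, sigma2, sigma3)]
print(checks, traces)
```

All three relations hold and all three traces vanish, consistent with the generators of $SU(2)$ being traceless.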


2.7.4 Combining states 


Representation theory provides the rules for how to combine states.²⁵
Some examples of the arithmetic for combining $SU(n)$ states are as
follows:


• $SU(2)$:
$2 \otimes 2 = 4 = 3 \oplus 1$
(3 symmetric states, 1 antisymmetric state);
$2 \otimes 2 \otimes 2 = 8 = 4 \oplus 2 \oplus 2$
(4 symmetric states, 2 mixed-symmetric and 2 mixed-antisymmetric).

• $SU(3)$:
$3 \otimes 3 \otimes 3 = 27 = 10 \oplus 8 \oplus 8 \oplus 1$
(10 symmetric states, 8 mixed-symmetric, 8 mixed-antisymmetric
and one antisymmetric state).


Further details of $SU(2)$ and $SU(3)$ and their use in particle physics
are given in Chapter 5. See also the exercises at the end of this chapter.

25 A familiar example is how to combine angular momentum states
(representations of the three-dimensional rotation group $R_3$).
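
For $SU(2)$ the multiplet arithmetic above is the familiar rule for adding angular momenta, $j_1 \otimes j_2 = |j_1 - j_2| \oplus \ldots \oplus (j_1 + j_2)$, and can be generated mechanically. The sketch below works with the doubled values $2j$ so that everything stays an integer; it covers only $SU(2)$, and the function names are ours.

```python
def combine(twoj_list, twoj_b):
    """Couple each multiplet in twoj_list (values of 2j) with one of spin twoj_b/2."""
    out = []
    for twoj_a in twoj_list:
        # Allowed totals run from |ja - jb| to ja + jb in integer steps of j.
        for twoj in range(abs(twoj_a - twoj_b), twoj_a + twoj_b + 1, 2):
            out.append(twoj)
    return sorted(out, reverse=True)

def dimensions(twoj_list):
    """Multiplet sizes 2j + 1."""
    return [twoj + 1 for twoj in twoj_list]

two_doublets = combine([1], 1)               # 2 x 2
three_doublets = combine(two_doublets, 1)    # 2 x 2 x 2
print(dimensions(two_doublets), dimensions(three_doublets))
```

The dimensions on each side must add up, which gives a useful consistency check: $3 + 1 = 4$ and $4 + 2 + 2 = 8$.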


Chapter summary 


• Symmetries and invariance, addition of angular momentum.
• Clebsch-Gordan coefficients and branching ratios.

• Lorentz-invariant variables, cross section, and decay rate.

• Two- and three-body kinematics, Dalitz plots.

• Unstable states and Breit-Wigner resonance, event rates.

• Group theory, the groups $U(1)$, $SU(2)$, $SU(3)$.


Further reading 


• Gasiorowicz, S. (1974). Quantum Physics. Wiley.

• Halzen, F. and Martin, A. (1974). Quarks and Leptons. Wiley. Gives a
good description of partial waves.

• Muirhead, H. (1974). The Special Theory of Relativity. Macmillan.


Exercises


(2.1) What is the minimum beam energy needed to
produce a W boson in a fixed-target collision?
Compare this value with the equivalent beam
energy of a symmetric $pp$ collider.

(2.2) Consider the production of neutrinos from decays
of $\pi^-$. The $\pi^-$ are produced by collisions of
protons of energy $E$ on a stationary target. Estimate
the minimum energy required to produce neutrinos
of energy 10 GeV.

(2.3) The $\rho^0$ decays strongly to $\pi^+\pi^-$, but not to $\pi^0\pi^0$.
Why?

(2.4) Starting from eqn 2.15, determine the elements of
the rotation matrix for spin-4 particles.

(2.5) Consider two-body decay $A \to B + C$. In the
rest frame of $A$, use energy conservation ($M_A =
E_B + E_C$) and 3-momentum conservation (decay
products $B$ and $C$ must travel in opposite
directions) to give an alternative derivation of the
expressions for $E_a$ and $E_b$ in eqn 2.40. Finally,
derive eqn 2.39 for $p^*$ (magnitude of $B$ or $C$
3-momentum).

(2.6) Starting from the definitions of $s$ and $t$, eqn 2.23,
derive eqns 2.24 and 2.25 for the case of $\pi p$
scattering in a fixed-target geometry.

(2.7) Referring to Fig. 2.4, the Dalitz plot for the decay
$D_s^+ \to \pi^+\pi^-\pi^+$, can you explain the accumulation
of events around a mass-squared of 1.9 GeV$^2$
on the diagonal? The PDG data tables may be
useful.

(2.8) Why does the $\phi(1020)$ have such a small width and
what might this imply about its quark content?

(2.9) Work through the calculation to get from eqn 2.44
to the expression in eqn 2.45 for $d\sigma/d\Omega$. Think
carefully about how to use the overall $\delta^4(\cdot)$
constraint from 4-momentum conservation.

(2.10) Using the definitions of $U(n)$ and $SU(n)$, show
that the unitary matrices used to represent $SU(n)$
require $n^2 - 1$ real numbers.

(2.11) Show that for a Lie group of order $n$, there will
be $n(n-1)/2$ commutation relations defining the
structure constants.


1 An essential step in electronic chip fabrication.


Fig. 3.1 A pill-box cavity.


Accelerators 


Accelerators are devices that accelerate charged particles to a broad 
range of energies from keV to TeV. This chapter is a very brief introduc- 
tion to accelerators in particle physics and high-energy nuclear physics. 
There are only a few accelerators used for particle or nuclear physics, 
but about 30000 accelerators are currently used worldwide for other 
very important applications. There are about 9000 accelerators used in 
cancer therapy, 9500 in ion implantation,¹ 4500 for cutting and welding,
2000 for electron-beam and X-ray sources, 1000 for neutron generators, 
and more in other fields. Each type of accelerator is built using the 
technology most suitable for its particular application. In this very brief 
introduction, we will focus only on synchrotrons and linear accelerators, 
since these are the typical choices in particle physics. Some excellent 
introductory textbooks on accelerator physics are given in the Further 
Reading at the end of this chapter. 


3.1 Radiofrequency acceleration 


3.1.1 Electric and magnetic fields 


Particles are accelerated by the electric field only,

$\mathbf{E} = -\nabla\phi - \frac{\partial \mathbf{A}}{\partial t}$ (3.1)


where $\phi$ is the scalar potential and $\mathbf{A}$ the vector potential. In particle
physics accelerators, the time-dependent vector potential A is the source 
of the accelerating field E. The simplest realization is a cylindrical struc- 
ture, sketched in Fig. 3.1, and is called a pill-box cavity. Microwave 
radiation (with frequency in the MHz-GHz range) produced in a device 
called a klystron (see Section 3.1.6) is guided to the pill-box cavity, where 
it forms a standing wave; i.e. the pill-box cavity acts as a resonator. In 
free space, electromagnetic waves can only have transverse electric and 
magnetic fields with respect to the direction of propagation; however, in- 
side a cavity, we can have transverse electric (TE) or transverse magnetic 
(TM) modes, indicating fields that have either longitudinal magnetic or 
electric fields, respectively. Since we need a longitudinal electric field 
to accelerate charged particles, only the TM modes will be useful. The 
modes can be found by solving Maxwell’s equations in free space with- 
out free charges or currents, subject to the usual boundary conditions at 
the conducting surfaces of the cavity. These conditions ensure that the
tangential component of the E field and the perpendicular component
of the B field vanish at the surface of a conductor.

The useful (i.e. accelerating) modes of this cavity are $\mathrm{TM}_{nlm}$, where
the indices $n$, $l$, and $m$ refer to the field variations along the usual polar
coordinates $\phi$ (azimuthal), $r$ (radial), and $s$ (longitudinal, i.e. along
the beam direction). Since the radial variations are given by Bessel
functions and we require a non-zero component of the electric field in
the longitudinal direction $s$ on-axis (i.e. $E_s(r = 0) \neq 0$), and we must
satisfy the boundary conditions at the surface of the conductor, we
need the $l = 1$ modes. We want to select the modes that minimize the
energy stored (and hence the electricity costs) for a given accelerating
gradient. This means that we use the $\mathrm{TM}_{010}$ mode. This mode has only
two components, the electric field $E_s$ in the direction of the acceleration
($s$) and the azimuthal component of the magnetic field $B_\phi$ in the cavity
(as indicated in Fig. 3.1), which oscillate at the radiofrequency (RF)
angular frequency $\omega$:


$$E_s \sim J_0(kr)\,e^{i\omega t}, \qquad B_\phi \sim J_1(kr)\,e^{i\omega t} \tag{3.2}$$


where $J_0$ and $J_1$ are the lowest-order Bessel functions and $k = 2\pi/\lambda$
is the wavenumber. The requirement that $E_s(R) = 0$, where $R$ is the
radius of the pill box, determines the allowed value of $k$ from the
zeros of the Bessel function. The first zero of the $J_0$ Bessel function is
$J_0(2.40) = 0$, therefore $kR = 2.40$ and hence $\lambda = 2\pi R/2.40$. So for the
$l = 1$ mode, $\lambda \approx 2.62R$. The amplitudes $J_0$ and $J_1$ depend on the radial
coordinate $r$ as sketched in Fig. 3.2.
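The relation between the cavity radius and the resonant wavelength can be checked numerically. The following is a minimal sketch, assuming only the power series for $J_0$ and a simple bisection search; the function names and the example radius of 0.3 m are illustrative choices, not values taken from the text.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def bessel_j0(x, terms=40):
    """J0(x) from its power series: sum_k (-1)^k (x/2)^(2k) / (k!)^2."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= -(x / 2.0) ** 2 / ((k + 1) ** 2)
    return total


def first_zero_j0(lo=2.0, hi=3.0, tol=1e-12):
    """Bisection for the first zero of J0, which lies between 2 and 3."""
    f_lo = bessel_j0(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = bessel_j0(mid)
        if f_lo * f_mid <= 0:
            hi = mid
        else:
            lo, f_lo = mid, f_mid
    return 0.5 * (lo + hi)


def tm010_wavelength_m(radius_m):
    """Wavelength of the TM010 mode from k*R = first zero of J0."""
    return 2.0 * math.pi * radius_m / first_zero_j0()


def tm010_frequency_hz(radius_m):
    """Resonant frequency f = c / lambda of the TM010 mode."""
    return C_LIGHT / tm010_wavelength_m(radius_m)


radius = 0.3  # hypothetical pill-box radius in metres
print(f"k*R at first zero of J0: {first_zero_j0():.4f}")
print(f"lambda / R             : {tm010_wavelength_m(1.0):.3f}")
print(f"f for R = {radius} m      : {tm010_frequency_hz(radius) / 1e6:.0f} MHz")
```

Because the frequency scales as $1/R$, cavities operating at a few hundred MHz have radii of a few tens of centimetres.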


3.1.2 Circular accelerators and synchronicity 


Suppose now that we want to accelerate protons in this pill-box cavity
along the direction $s$, which goes from left to right. We need to inject
these protons into the cavity when the electric field component $E_s$ can
accelerate them, i.e. when it is positive. After a time $t = \pi/\omega$, the field
changes sign and injected protons would be decelerated. If we further
assume that the pill-box cavity is an accelerating structure of a
synchrotron where protons go around on a closed trajectory, approximately a
circle,² then we will end up with the synchronicity condition


$$\omega = N\,\omega_{\mathrm{rev}} \tag{3.3}$$


The RF frequency needs to be an integer (N) multiple of the revolution 
frequency $\omega_{\mathrm{rev}}$ with which protons go around in the synchrotron. This
is demonstrated in the cartoon in Fig. 3.3. The value of N is chosen for 
practical reasons—for example to make the RF frequency lie in the range 
where components such as amplifiers are available. This means that N is 
typically very large (e.g. at LEP, N = 31320). This defines the number 
of ‘buckets’ in which we can potentially store stable beams. However, 
we usually only want to inject particles into a much smaller number of 
buckets.³ In a synchrotron, protons go through the accelerating cavities
Fig. 3.2 Amplitudes of the $\mathrm{TM}_{010}$
fields in a pill-box cavity.


Fig. 3.3 The child accelerates the 
roundabout by only pushing at the 
correct phase. 


2 There will be straight sections, for ex- 
ample in the experimental halls where 
particle physics detectors are located. 


3 The groups of particles in filled buckets
are called bunches. At the LHC, the
spacing between buckets is 2.5 ns, but 
only ~ 10% of buckets are filled with 
protons. 
Fig. 3.4 An accelerating cavity
consisting of two pill-box cavities. A 
proton bunch in the left-hand gap will 
be accelerated. A proton bunch in the 
middle drift tube will be shielded from 
the decelerating field. When it emerges 
into the right-hand gap, the field has 
changed sign. 


4 For simplicity, we assume that the
paths around the accelerator are of the
same length for all particles considered.
In reality, different momenta/energies
lead to different paths, which also need
to be taken into account.


5 For electrons, we would need to take
account of energy losses due to synchrotron
radiation and differences among
trajectories of particles with different
momenta. The conclusion would be
that $N_1$ and not $M_1$ would correspond
to a stable reference trajectory for a
bunch of electrons.


Fig. 3.5 Energy gain in a cavity as a
function of the arrival time or relative
phase of a beam particle with respect
to the oscillating electric field in the
cavity. $E_0$ is the accelerating electric
field amplitude, $L$ the cavity length,
and $e$ the charge of the accelerated
particle. Other symbols are explained in
the text.


many times, gaining energy at every passage. The magnetic field guiding 
them through the accelerator (see Section 3.2) changes along with the 
acceleration, keeping them on the same orbit. 
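As a rough numerical illustration of the synchronicity condition (eqn 3.3), the sketch below computes the revolution frequency of an ultrarelativistic beam and the RF frequency implied by a given $N$. The ring circumference of 26 659 m (the LEP/LHC tunnel) is an assumption that is not quoted in the text, and the function names are purely illustrative.

```python
C_LIGHT = 299_792_458.0  # speed of light in m/s


def revolution_frequency_hz(circumference_m, beta=1.0):
    """Revolution frequency f_rev = beta * c / C for a particle of speed beta*c."""
    return beta * C_LIGHT / circumference_m


def rf_frequency_hz(harmonic_number, circumference_m, beta=1.0):
    """Synchronicity condition (eqn 3.3): f_RF = N * f_rev."""
    return harmonic_number * revolution_frequency_hz(circumference_m, beta)


# Assumed circumference of the LEP/LHC tunnel (not given in the text).
TUNNEL_CIRCUMFERENCE_M = 26_659.0

f_rev = revolution_frequency_hz(TUNNEL_CIRCUMFERENCE_M)
f_rf_lep = rf_frequency_hz(31_320, TUNNEL_CIRCUMFERENCE_M)  # N quoted for LEP
bucket_spacing_ns = 1e9 / f_rf_lep  # one bucket per RF period

print(f"f_rev          = {f_rev / 1e3:.2f} kHz")
print(f"f_RF           = {f_rf_lep / 1e6:.1f} MHz")
print(f"bucket spacing = {bucket_spacing_ns:.2f} ns")
```

With these inputs the RF frequency comes out at a few hundred MHz, which is the range where high-power RF components are readily available.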

One can combine many pill-box cavities, stacking them one after 
another as sketched in Fig. 3.4. Protons are then accelerated in gaps 
between the drift tubes (see Fig. 3.4) when the accelerating electric 
field points in the right direction and then ‘hide’ inside the drift tubes, 
isolated from the electric field when it points in the wrong direction, 
emerging in the next pill box when the electric field is again pointing in 
the right direction. 

The fact that beam particles need to enter an accelerating cavity at
the right time leads to the bunched structure of the beam. This is
demonstrated in Fig. 3.5 in more detail. A proton, for example, with nominal
momentum $p$ and travelling along the nominal path is called a
synchronous particle and its trajectory is the reference trajectory. It arrives at the
entrance to the cavity at point $M_1$. After going through the cavity, its
energy is increased by $\mathcal{E}_s$ (here we are assuming that $\omega L/c \ll 1$). Another
proton arrives a little earlier at point $P$ and its energy is increased by a
smaller amount than $\mathcal{E}_s$, so when it enters the cavity again, after the next
revolution, it will be less early relative to the synchronous
particle.⁴ Yet another proton arrives a little late at point $P'$. This proton
will gain more energy than $\mathcal{E}_s$, so, after the next revolution, it will be
less late. In a synchrotron in which we combine the accelerating cavities
with magnetic fields, these different energy gains and losses will lead to
oscillations ('synchrotron oscillations').

In contrast, considering points $Q$ and $Q'$ in relation to point $N_1$, one
can see that a proton arriving early with respect to $N_1$ will gain more
energy and one arriving late will gain less energy than the proton arriving
at $N_1$; thus, every revolution, these protons will be further and further
apart from each other, eventually escaping from the accelerator.⁵ So $M_1$
and $M_2$ are stable points where bunches of beam particles can be located
and $N_1$ and $N_2$ are unstable points.
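The difference between the stable and unstable points can be seen in a toy turn-by-turn simulation. This is only a sketch under simplifying assumptions: the map, the parameter values (`kick`, `slip`, a synchronous phase of 30 degrees), and the function names are all hypothetical, and a particle with more energy is assumed to arrive earlier on the next turn, as for the protons discussed above.

```python
import math


def track(phi0, delta0, phi_s, turns=200, kick=0.01, slip=1.0):
    """Toy turn-by-turn map for one particle.

    phi   : RF phase at which the particle crosses the cavity
    delta : energy deviation from the synchronous particle (arbitrary units)
    Each turn the particle gains kick * (sin(phi) - sin(phi_s)) relative to
    the synchronous particle; a positive delta then makes it arrive earlier
    (smaller phi) on the next turn. Returns the list of phases visited.
    """
    phi, delta = phi0, delta0
    phases = [phi]
    for _ in range(turns):
        delta += kick * (math.sin(phi) - math.sin(phi_s))
        phi -= slip * delta
        phases.append(phi)
    return phases


def max_excursion(phases, centre):
    """Largest distance in phase from the chosen reference point."""
    return max(abs(p - centre) for p in phases)


phi_s = math.radians(30.0)  # assumed synchronous phase on the rising slope (like M1)
phi_u = math.pi - phi_s     # same energy gain on the falling slope (like N1)
offset = 0.1                # initial phase error in radians

stable = max_excursion(track(phi_s + offset, 0.0, phi_s), phi_s)
unstable = max_excursion(track(phi_u + offset, 0.0, phi_s), phi_u)

print(f"max phase excursion near stable point  : {stable:.3f} rad")
print(f"max phase excursion near unstable point: {unstable:.3f} rad")
```

Near the stable point the phase error stays close to its initial size and oscillates, which is the synchrotron oscillation described above; near the unstable point the same initial error grows without limit.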

This simplified discussion of phase stability needs to be expanded to
take into account competing changes: the speed of the particles and the
radius of the orbit. The revolution frequency is simply related to the
speed $v$ and the radius $R$ by

$$f = \frac{v}{2\pi R} \tag{3.4}$$

Both v and R depend on the momentum of the particle. From relativistic 
kinematics, we know that 


$$p = \frac{mv}{\sqrt{1 - v^2}} \tag{3.5}$$

(in units with $c = 1$). When the RF acceleration increases the momentum
of the particle, it will also cause it to follow a slightly different orbit.
This change $\Delta R$ in the radius is defined by the dispersion $D$, which for a
certain change $\Delta p$ in the momentum gives⁶

$$\Delta R = D\,\frac{\Delta p}{p} \tag{3.6}$$


It is conventional to define the 'slip factor', which is given by the
fractional change in frequency divided by the fractional change in
momentum:

$$\eta_{\mathrm{RF}} = \frac{\Delta f/f}{\Delta p/p} \tag{3.7}$$


Substituting from eqns 3.5 and 3.6, we can evaluate (see Exercise 3.2)
the two terms in eqn 3.7:

$$\eta_{\mathrm{RF}} = \frac{1}{\gamma^2} - \frac{D}{R_0} \tag{3.8}$$

where $R_0$ is the radius of the reference trajectory. Therefore, for injection
at low momentum, for which $\gamma \approx 1$, for a typical proton synchrotron,
$\eta_{\mathrm{RF}} > 0$. However, as the momentum increases, we will reach a transition
in which $\eta_{\mathrm{RF}} = 0$ and, for higher momentum, $\eta_{\mathrm{RF}} < 0$. This implies that
the region of phase stability flips when we cross the transition defined by

$$\frac{1}{(\gamma_{\mathrm{transition}})^2} = \frac{D}{R_0} \tag{3.9}$$

Therefore, we need to change the RF phase as we cross such transitions.⁷


The concept of phase stability discussed here is one of the key ideas 
that enabled the successful operation of high-energy accelerators. 
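Equations 3.8 and 3.9 are simple enough to evaluate directly. The sketch below does so for a made-up machine: the dispersion of 2.5 m and reference radius of 100 m are hypothetical numbers chosen only to show the sign change of the slip factor, and the function names are illustrative.

```python
import math


def slip_factor(gamma, dispersion_m, r0_m):
    """Slip factor eta_RF = 1/gamma^2 - D/R0 (eqn 3.8)."""
    return 1.0 / gamma**2 - dispersion_m / r0_m


def gamma_transition(dispersion_m, r0_m):
    """Lorentz factor at which eta_RF = 0 (eqn 3.9)."""
    return math.sqrt(r0_m / dispersion_m)


# Hypothetical machine parameters, chosen only for illustration.
D, R0 = 2.5, 100.0  # dispersion and reference radius in metres
g_t = gamma_transition(D, R0)

for gamma in (1.5, g_t, 20.0):
    eta = slip_factor(gamma, D, R0)
    print(f"gamma = {gamma:6.3f}  eta_RF = {eta:+.4f}")
```

Below the transition energy the slip factor is positive, at transition it vanishes, and above it the sign is reversed, which is why the RF phase has to be changed when the beam crosses transition.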


3.1.3 Accelerating-cavity design and Q factor 


For efficient operation of an accelerating cavity, i.e. a standing-wave 
resonator, one requires that the energy be transferred from the resonator 
to the accelerated beam and not dissipated into the environment by 
losses to the walls of the cavity and by radiation to the environment. 
This means that the Q value, i.e. the quality factor defined as the ratio 
6 The value of the dispersion depends
on the type and strength of the magnetic
focusing. See Wilson in Further
Reading for details.


7 This can be done sufficiently quickly
that the particle losses are negligible.


8 In this subsection we will work in SI
units.

9 This is a simplified calculation that
neglects radiation losses.


10 For superconducting cavities, we
have to consider the effective surface
resistance, so there are still losses;
however these are orders of magnitude
smaller for superconducting niobium
compared with copper, which suffers from
Ohmic losses.




Fig. 3.6 Schematic cross-section of an 
RF cavity (not to scale). Note the 
small accelerating gap and the 
relatively large volume. The RF power 
enters through an insulating ceramic 
window and couples into the cavity. 


of the average energy stored in the cavity, $U$, to the average energy
dissipated in one oscillation period, $U_d$, is very large. The energy stored
alternates between the electric and magnetic fields, but we can calculate
this from the peak value of the magnetic field, $H_0$:

$$U = \frac{\mu_0}{2}\int |H_0|^2\, dV \tag{3.10}$$


We can calculate $U_d$ from the Ohmic losses at the surface of the cavity.⁹
For a good conductor ($\sigma/\epsilon\omega \gg 1$), we can neglect the displacement
current. Using Ampère's law, we can show that $|H| = j_{\mathrm{surf}}$, where $j_{\mathrm{surf}}$
is the surface current per unit length. The power dissipated, $I^2R$, can
be evaluated as a surface integral

$$P = \frac{1}{2}\int |H_0|^2 R_{\mathrm{surf}}\, dS \tag{3.11}$$


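where the surface resistance $R_{\mathrm{surf}} = 1/(\sigma\delta)$ and the skin depth
$\delta = \sqrt{2/(\mu_0\sigma\omega)}$. The energy dissipated over one period $T = 2\pi/\omega$ is then
given by

$$\Delta W = PT = \frac{\pi}{\sigma\delta\omega}\int |H_0|^2\, dS = \frac{\pi\mu_0\delta}{2}\int |H_0|^2\, dS \tag{3.12}$$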


Comparing eqns 3.10 and 3.12, we can see that to maximize the Q value, 
we need a high frequency, a high conductivity, and a large ratio of volume 
to surface area. The RF frequency used is usually in the range 100 MHz 
to ~10 GHz.¹⁰
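To get a feel for the sizes involved, the skin depth and surface resistance defined above can be evaluated for a normal-conducting cavity. This is a minimal sketch in SI units; the copper conductivity of $5.8 \times 10^7$ S/m is an approximate room-temperature value assumed here, not a number from the text, and the function names are illustrative.

```python
import math

MU_0 = 4e-7 * math.pi  # permeability of free space in H/m


def skin_depth_m(conductivity_s_per_m, frequency_hz):
    """Skin depth delta = sqrt(2 / (mu_0 * sigma * omega))."""
    omega = 2.0 * math.pi * frequency_hz
    return math.sqrt(2.0 / (MU_0 * conductivity_s_per_m * omega))


def surface_resistance_ohm(conductivity_s_per_m, frequency_hz):
    """Surface resistance R_surf = 1 / (sigma * delta)."""
    delta = skin_depth_m(conductivity_s_per_m, frequency_hz)
    return 1.0 / (conductivity_s_per_m * delta)


SIGMA_COPPER = 5.8e7  # S/m, approximate room-temperature value (assumption)

for f_hz in (100e6, 1e9, 10e9):
    delta = skin_depth_m(SIGMA_COPPER, f_hz)
    r_surf = surface_resistance_ohm(SIGMA_COPPER, f_hz)
    print(f"f = {f_hz / 1e6:7.0f} MHz  delta = {delta * 1e6:5.2f} um  "
          f"R_surf = {r_surf * 1e3:5.2f} mOhm")
```

The currents flow in a layer only a few micrometres thick, so it is the quality of the inner surface of the cavity, rather than the bulk material, that controls the losses.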

Next we need to consider the fact that the electric field is varying 
while the particle crosses the cavity. The electric field for the n = 0 
mode on axis (r = 0) depends on time as 


$$E_s = E_0\cos(\omega t) \tag{3.13}$$


For an ultrarelativistic particle crossing the cavity, the position along 
the axis of the cavity is simply s = ct and its speed does not change, 
but it gains energy over the length of the gap, G: 


$$\Delta W = \int_{-G/2}^{G/2} qE_0\cos\!\left(\frac{\omega s}{c}\right) ds = qE_0 G\,\frac{\sin(\omega G/2c)}{\omega G/2c} \tag{3.14}$$


It is clear from eqn 3.14 that for efficient acceleration, we need the gap 
length to be significantly less than the wavelength. This helps keep the 
bunches in the accelerating phase and prevents slippage into the decel- 
erating phase. On the other hand, we have seen that to obtain a high 
Q value (and hence minimize Ohmic losses), we need a large ratio of 
volume to surface area. The RF cavities that are used in high-energy ac- 
celerators have shapes optimized to meet these two requirements, which 
include a short gap length and a large volume. A (very schematic) sketch 
of a cross section of a cavity is shown in Fig. 3.6. Typically, many cavities 
are combined in one structure. 
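The factor $\sin(x)/x$ in eqn 3.14, with $x = \omega G/2c = \pi G/\lambda$, measures how much of the peak field an ultrarelativistic particle actually sees while crossing the gap. The sketch below tabulates it for a few gap lengths; the function names and the chosen ratios $G/\lambda$ are illustrative assumptions.

```python
import math


def transit_time_factor(gap_m, wavelength_m):
    """sin(x)/x with x = omega*G/(2c) = pi*G/lambda, from eqn 3.14."""
    x = math.pi * gap_m / wavelength_m
    return 1.0 if x == 0.0 else math.sin(x) / x


def energy_gain_ev(e0_volts_per_m, gap_m, wavelength_m):
    """Energy gain in eV for a particle of unit charge crossing the gap on crest."""
    return e0_volts_per_m * gap_m * transit_time_factor(gap_m, wavelength_m)


wavelength = 1.0  # work in units of the RF wavelength
for ratio in (0.05, 0.1, 0.25, 0.5, 1.0):
    t = transit_time_factor(ratio * wavelength, wavelength)
    print(f"G/lambda = {ratio:4.2f}  transit-time factor = {t:.3f}")
```

For a gap of one-tenth of a wavelength the reduction is below 2%, whereas for a gap of a full wavelength the net energy gain vanishes.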


Good-quality accelerating cavities that can be produced in large
quantities have accelerating fields in the region of 20-30 MV m⁻¹. Single
cavities might achieve up to 100 MV m⁻¹, which is a breakdown limit (or
beyond) for most materials.¹¹ To achieve larger accelerating fields, one
needs a different approach. Fields up to 100 GV m⁻¹ can be obtained in
plasma (no walls to break down), where electrons can be displaced from
quasistationary ions. This is a very active research area that eventually
might lead to a new generation of accelerators.


3.1.4 Synchrotron radiation energy loss 


When charged particles are accelerated in a circular machine, they lose
energy by synchrotron radiation. For an ultrarelativistic particle of mass
$m$ and Lorentz factor $\gamma$, in an orbit with radius of curvature $\rho$, the power
emitted in the form of synchrotron radiation is

$$P = \frac{2}{3}\,\frac{r_e m c^3 \gamma^4}{\rho^2} \tag{3.15}$$

where $r_e = e^2/(4\pi\epsilon_0 mc^2)$ is the classical radius of the particle. As the
synchrotron radiation scales as $\gamma^4$, it will generally be negligible for
protons, but the losses for electron machines will be very significant. The
energy loss grows as the fourth power of the energy, and therefore there 
is a limit to the energy reach of circular electron machines. Although 
synchrotron radiation can be reduced by increasing the radius of the 
machine, this becomes prohibitively expensive at high energies. LEP is 
generally considered to be the highest-energy circular electron acceler- 
ator that will be built: higher-energy electron—positron machines will 
be linear colliders, for which the synchrotron radiation is negligible.¹²
While synchrotron radiation is a problem for particle physics applica- 
tions, it turns out to have many uses in other areas of science. The 
synchrotron radiation in the laboratory frame is forward-peaked around 
the electron direction and provides a very high-brightness X-ray source. 
Dedicated electron rings are built with ‘wiggler’ magnets to increase the 
synchrotron radiation. The X-rays are used in condensed matter physics, 
biology, medical applications, and other fields. 
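The strong mass dependence can be made concrete by multiplying the power in eqn 3.15 by the revolution time $2\pi\rho/c$, which gives the energy radiated per turn, $(4\pi/3)\, r_e mc^2 \gamma^4/\rho$. The sketch below evaluates this for electrons and protons of the same energy. The beam energy of 100 GeV and bending radius of 3000 m are illustrative LEP-like values assumed here, not numbers from the text.

```python
import math

R_E_M = 2.8179403e-15    # classical electron radius in metres
M_E_GEV = 0.51099895e-3  # electron rest energy in GeV
M_P_GEV = 0.93827209     # proton rest energy in GeV


def energy_loss_per_turn_gev(energy_gev, rho_m, rest_energy_gev):
    """Energy radiated per turn, U0 = P * (2*pi*rho/c), using eqn 3.15.

    The classical radius scales as 1/m, so r = r_e * (m_e / m) for a
    particle of unit charge and mass m.
    """
    gamma = energy_gev / rest_energy_gev
    r_classical = R_E_M * M_E_GEV / rest_energy_gev
    return (4.0 * math.pi / 3.0) * r_classical * rest_energy_gev * gamma**4 / rho_m


# Illustrative LEP-like values (assumptions, not taken from the text).
BEAM_ENERGY_GEV = 100.0
BENDING_RADIUS_M = 3000.0

u0_electron = energy_loss_per_turn_gev(BEAM_ENERGY_GEV, BENDING_RADIUS_M, M_E_GEV)
u0_proton = energy_loss_per_turn_gev(BEAM_ENERGY_GEV, BENDING_RADIUS_M, M_P_GEV)

print(f"electron: {u0_electron:.2f} GeV per turn")
print(f"proton  : {u0_proton:.2e} GeV per turn")
print(f"ratio   : {u0_electron / u0_proton:.2e}")
```

At fixed energy and radius the ratio of the two losses is $(m_p/m_e)^4$, about $10^{13}$, which is why the RF system of a high-energy electron ring must replace several per cent of the beam energy on every turn while the loss for protons at the same energy is negligible.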


3.1.5 Linear accelerators 


In a linear accelerator, one can use standing-wave cavities as described 
above or a travelling electromagnetic wave to accelerate electrons. Of 
course, the travelling wave needs to be propagating in a waveguide-like 
structure in order for the electric field to have a component along the 
direction of travel. Then one can inject electrons to sit on the crest 
of that travelling wave and gain energy as indicated in the cartoon in 
Fig. 3.7. A schematic sketch of an accelerating structure is shown in 
Fig. 3.8, where 2a is the diameter containing the beam and 2b is the 
outer diameter of the waveguide. 
11 The breakdown mechanisms are different
for conducting and superconducting
cavities. For superconducting
cavities, the hard limit is set by the fact
that if the magnetic field at the surface
becomes too large, the superconductor
will return to the normal resistive state
('quench'). In practice, no useful
superconducting cavities have been made
with gradients above 50 MV m⁻¹.


12 There is currently some discussion
about ideas for very large circular
colliders, including the $e^+e^-$ option. If
such a machine is ever built, it will
certainly be very expensive!


Fig. 3.7 The principle behind 
travelling-wave acceleration. 
Fig. 3.8 Disk-loaded accelerating
structure.

Fig. 3.9 Travelling-wave accelerating
structure.

There are two approaches to particle acceleration. One is based on the
use of cavities with short accelerating gaps (see eqn 3.14). An alternative 
approach uses a waveguide structure in which we have a travelling wave. 
However, in a smooth waveguide, the phase velocity is always larger than 
c and therefore cannot be used for particle acceleration. One approach 
to this problem is to insert discs inside the structure that are used to 
adjust the phase velocity of the travelling wave. The radii a and b and 
the distances between discs are chosen such that the phase velocity of 
the wave equals the electron velocity. They need to be changing along 
the structure as electrons are being accelerated. But once the electron 
speed becomes very close to the speed of light, there is no need to change 
the geometry of the structure. A more realistic sketch of an accelerating 
structure is shown in Fig. 3.9. An RF wave produced by a klystron enters 
and leaves each cavity to be absorbed outside the cavity. If instead of 
being absorbed the wave is reflected at the end of the cavity, a standing 
wave will be created that could also be used to accelerate electrons. 


3.1.6 Klystrons 


This section gives a very brief and simplified idea of how klystrons work. 
A DC high voltage is first used to accelerate a continuous electron beam. 
The electron beam enters an RF cavity to which RF power is delivered 
at a resonant frequency. This causes the velocity of the electron beam 
to become modulated. The electrons enter a drift region in which the 
velocity modulation is translated into spatial modulation (bunching). 
Finally, the electron bunches enter another RF cavity called the ‘catcher 
region’. They enter out of phase with the RF, so they are decelerated 
and their kinetic energy is converted into RF energy. The RF wave is 
then guided by a waveguide to the accelerating RF structure. 


3.2 Beam optics 


3.2.1 Magnetic lenses 


To guide and focus beam particles along the reference trajectory (which 
may not be a straight line), one needs magnets. In a synchrotron, 
for example, which has a circular geometry, one needs dipole magnets 


providing a vertical magnetic field (the accelerators are constructed in 
the horizontal plane) to bend the trajectories of beam particles so they 
stay inside the beam pipe (with a good vacuum inside), close to the 
reference trajectory. But the vertical magnetic field is not enough. For 
example, it does not constrain particle movement along the field direc- 
tion, so any vertical component of the momentum, however small, would 
result in beam particles eventually escaping from the accelerator. One 
needs to have an arrangement of magnetic fields such that beam particles 
are effectively confined as if they were in a potential well that prevents 
them from going too far away from the reference trajectory. Quadrupole 
magnets are needed for this, together with other magnets for fine tuning. 
Only dipoles and quadrupoles will be considered here. 

A schematic view of a dipole magnet is shown in Fig. 3.10. The beam 
pipe, a continuous vacuum chamber, runs through the yoke gap, where a 
magnetic field B is created by electric currents in the two coils. A ‘warm’ 
iron yoke can be used for fields up to about 2 T. To avoid iron saturation 
effects and achieve higher fields, one needs superconducting dipoles (see 
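Section 3.3). The dipole bending strength for a particle with momentum
$p$ and charge $q$ is then given by the inverse of the radius of curvature $\rho$:

$$\frac{1}{\rho}\,[\mathrm{m}^{-1}] = \frac{qB}{p} \approx 0.3\,\frac{B\,[\mathrm{T}]}{p\,[\mathrm{GeV}/c]} \quad \text{for } q = e = \text{the electron charge} \tag{3.16}$$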


In terms of the gap height $h$, the number of windings $n/2$ and the current
$I$ in each coil,

$$B = \frac{\mu_0 n I}{h} \tag{3.17}$$

where $\mu_0$ is the permeability of free space.
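Equations 3.16 and 3.17 can be evaluated directly. The sketch below is illustrative only: the proton momentum of 7000 GeV/c is an assumed LHC-like value (the 8.3 T field is the one quoted for the LHC dipoles in Section 3.3), the warm-magnet parameters are hypothetical, and the function names are not from the text.

```python
import math

MU_0 = 4e-7 * math.pi    # permeability of free space in H/m
GEV_PER_TM = 0.299792458  # the '0.3' of eqn 3.16: GeV/c per tesla-metre for q = e


def bending_radius_m(momentum_gev, field_tesla):
    """Radius of curvature rho = p / (0.3 B) for a unit-charge particle (eqn 3.16)."""
    return momentum_gev / (GEV_PER_TM * field_tesla)


def dipole_field_tesla(total_windings, current_a, gap_height_m):
    """Gap field B = mu_0 * n * I / h of an iron-yoke dipole (eqn 3.17)."""
    return MU_0 * total_windings * current_a / gap_height_m


# Assumed LHC-like momentum with the 8.3 T dipole field quoted in Section 3.3.
rho_lhc = bending_radius_m(7000.0, 8.3)
print(f"bending radius for 7000 GeV/c in 8.3 T: {rho_lhc:.0f} m")

# Hypothetical warm dipole: 2 x 20 windings, 1000 A, 5 cm gap.
b_warm = dipole_field_tesla(40, 1000.0, 0.05)
print(f"field of the hypothetical warm dipole : {b_warm:.3f} T")
```

The radius obtained this way is the bending radius inside the dipoles; the average radius of a real ring is larger because of straight sections and the space taken by other magnets.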

A schematic cross section of a quadrupole magnet is shown in Fig. 3.11.
Four pairs of coils, with $n$ windings and current $I$ in each coil,¹³ create
a magnetic field with components $B_x = -gz$ and $B_z = -gx$ in the
horizontal and vertical directions, respectively, where

$$g = \frac{2\mu_0 n I}{R^2} \tag{3.18}$$

and $R$ is the distance shown in Fig. 3.11. The corresponding components
of the Lorentz force acting on a particle with speed $v$ are

$$F_x = qvB_z = -qvgx \quad \text{and} \quad F_z = -qvB_x = qvgz \tag{3.19}$$


The important point to note is that in the vertical plane (containing the 
origin, i.e. the reference trajectory point) the force is acting away from 
the origin, while in the corresponding horizontal plane it acts towards 
the origin. One says that the quadrupole is focusing in one plane and de- 
focusing in another plane, perpendicular to the first. There is a complete 
Fig. 3.10 Schematic view of a dipole
magnet.


13 This equation assumes that we are
not using a warm iron core magnet,
so it is only valid for superconducting
quadrupoles.

Fig. 3.11 Schematic view of a
quadrupole magnet.

14 From the focusing perspective, the
dipole magnet acts as a drift tube. Note
that if l is too short, then the effective
focal length will become too long.


Fig. 3.12 FODO cell. 


analogy with geometrical optics, and for a quadrupole of length $l$, with
quadrupole strength $k$, one can define its focal length $f$:

$$\frac{1}{f} = kl, \quad \text{where } k\,[\mathrm{m}^{-2}] = 0.3\,\frac{g\,[\mathrm{T/m}]}{p\,[\mathrm{GeV}/c]} \text{ for } q = e \tag{3.20}$$

For the HERA proton ring, $k \approx 0.033\,\mathrm{m}^{-2}$, $l \approx 1.9$ m, and $f \approx 16$ m. If
$f \gg l$, the quadrupole can be treated as a thin lens irrespective of the
absolute value of $l$.

A thin lens of focal length $f_1$ and another thin lens of focal length $f_2$
arranged as a doublet of lenses separated by a drift tube of distance $l$ form
a focusing doublet with an effective focal length $f$ (see Exercise 3.4b)
given by

$$\frac{1}{f} = \frac{1}{f_1} + \frac{1}{f_2} - \frac{l}{f_1 f_2} \tag{3.21}$$
If one lens is focusing in the horizontal plane ($f$ positive) and one
defocusing ($f$ negative), then we can arrange for the effective focal length
of the system to be positive. Such a quadrupole doublet, with a drift
space between the two quadrupoles, can be arranged to give focusing in
both the horizontal and vertical directions. For example, let $f_1 = f_0$ and
$f_2 = -f_0$; then the effective focal length is given by $f = f_0^2/l$ in both
the horizontal and vertical dimensions. This is the idea behind so-called
'strong focusing', in which focusing and defocusing quadrupoles are
arranged in doublets with a dipole inside each doublet.¹⁴ A structure like
this is called a FODO cell, as sketched in Fig. 3.12. A FODO cell focuses 
beam particles in both planes. FODO cells are put together one after 
another as a periodic structure along the whole ring of a synchrotron. 
Calculations of particle trajectories inside such a structure can be per- 
formed using the same techniques as used in geometrical optics; hence 
this aspect of accelerator physics is called ‘beam optics’. The concept of 
using repeating structures of FODO cells is called strong focusing and 
it keeps the transverse dimensions of the beam relatively small all the 
way around the ring, allowing the use of relatively small beam pipes 
and magnets. Before strong focusing was discovered, synchrotrons used 
‘weak focusing’, which resulted in much larger beam pipes. Strong focus- 
ing was another key development that allowed the construction of very 
high-energy synchrotrons at an affordable price. 
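The claim that a focusing and a defocusing lens of equal strength give net focusing in both planes can be checked with the ray-transfer matrices of geometrical optics. The following is a minimal sketch assuming thin lenses; the helper names are illustrative, the focal length of 16 m echoes the HERA figure quoted above, and the 2 m drift length is a hypothetical choice.

```python
def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def thin_lens(f):
    """Transfer matrix of a thin lens acting on (position, angle)."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]


def drift(length):
    """Transfer matrix of a field-free drift space."""
    return [[1.0, length], [0.0, 1.0]]


def doublet_focal_length(f1, f2, length):
    """Effective focal length of lens f1, then a drift, then lens f2.

    The beam meets lens 1 first, so its matrix is the right-most factor.
    The lower-left element of the product equals -1/f.
    """
    m = matmul(thin_lens(f2), matmul(drift(length), thin_lens(f1)))
    return -1.0 / m[1][0]


f0, l = 16.0, 2.0  # f0 as for the HERA quadrupoles; l is a hypothetical spacing

# A quadrupole that focuses horizontally defocuses vertically, so the two
# planes see the lenses with opposite signs.
f_horizontal = doublet_focal_length(+f0, -f0, l)
f_vertical = doublet_focal_length(-f0, +f0, l)

print(f"horizontal: f = {f_horizontal:.1f} m")
print(f"vertical  : f = {f_vertical:.1f} m")
print(f"f0^2 / l  : {f0 ** 2 / l:.1f} m")
```

Both planes give the same positive focal length, equal to $f_0^2/l$.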


3.2.2 Beam trajectories and phase space


The motion of beam particles is described in a curvilinear coordinate
system, as sketched in Fig. 3.13. In a first, linear, approximation,
particle motion in each of the three space directions (longitudinal $s$,
vertical $z$, and horizontal $x$) can be considered separately. The transverse
phase space is split into two 2-dimensional phase spaces. Considering,
for example, the vertical direction, we have the $z$-coordinate and
$p_z$-component phase space. As indicated in Fig. 3.14, the velocity $z$
component can be described as a product of the angle with respect to the
$s$ direction and the speed along $s$, which is approximately the speed of
light. So, effectively, for a given and constant Lorentz factor $\gamma$, what
matters is the angle, and the $(z, p_z)$ phase space can be replaced by
$(z, z' = dz/ds)$ phase space. A similar argument applies in the other
transverse direction.
The strength of a quadrupole is defined by

$$k = \frac{1}{B\rho}\,\frac{dB_x}{dz} \tag{3.22}$$

where $\rho$ is the radius of curvature of the reference trajectory.¹⁵ The
equations of motion are then Hill's equations

$$z'' + k(s)\,z = 0 \tag{3.23}$$

$$x'' - \left(k(s) - \frac{1}{\rho^2}\right)x = \frac{1}{\rho}\,\frac{\Delta p}{p} \tag{3.24}$$


We will only consider solutions for $z$ and $z'$ (or for the horizontal phase
space for $\Delta p = 0$, i.e. for the nominal momentum and at the limit of
$\rho \to \infty$). This differential equation is reminiscent of simple harmonic
motion, but $k(s)$ is not a constant; it is a periodic function that defines
the focusing strength at any point along the ring (eqn 3.22). It is
obvious that if there were no focusing and $k(s) = 0$, then beam particles
could escape unimpeded. If $k(s)$ were constant around the ring, then
the solution would be simple harmonic motion. This suggests the use of
oscillatory trial functions similar to those for simple harmonic motion:¹⁶


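$$\begin{pmatrix} z \\ z' \end{pmatrix} = \begin{pmatrix} \sqrt{\epsilon}\,\sqrt{\beta(s)}\,\cos[\varphi(s) - \varphi_0] \\ -\dfrac{\sqrt{\epsilon}}{\sqrt{\beta(s)}}\left\{\sin[\varphi(s) - \varphi_0] + \alpha(s)\cos[\varphi(s) - \varphi_0]\right\} \end{pmatrix} \tag{3.25}$$

The initial conditions determine the values of $\epsilon$ and $\varphi_0$. The function
$\beta(s)$ defines the amplitude modulation, which varies because of the
changing focusing strength around the ring. From our trial solution, we can
also derive the relationship between the function $\beta(s)$ and the magnetic
focusing $k(s)$. It is convenient to define $\beta = w^2$, and we then find¹⁷

$$w'' - \frac{1}{w^3} + w\,k(s) = 0 \tag{3.26}$$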
Fig. 3.13 Curvilinear coordinate
system along the reference trajectory.

Fig. 3.14 A particle going from $z_1$ to
$z_2$ in the vertical direction.


15 In general, as we go around a ring (s),
the magnetic focusing will vary, so we
write k(s) to remind ourselves that k is
not a constant.


16 See Exercise 3.3 for a justification.


17 See Exercise 3.3.

18 See Wille in Further Reading.

Fig. 3.15 The phase-space ellipse in
the $(z, z')$ plane.


20 The horizontal and vertical Q values
are not related to the Q values of the
RF cavities discussed in Section 3.1.3.


21 We will see how to evade Liouville's
theorem when we consider stochastic
cooling in Section 3.6.


In principle, this allows us to determine $w(s)$ and hence $\beta(s)$ if we know
$k(s)$. However, this is not practical and matrix methods are used to
determine $\beta(s)$.¹⁸ The phase advance function $\varphi(s)$ is also determined
by the focusing. The resulting oscillations about the reference trajectory
are called betatron oscillations. The amplitudes of $z$ and $z'$ are written
using the amplitude function $\beta(s)$ and the emittance $\epsilon$ of the trajectory
(at the moment, we are talking only about one particle). The optical
function $\alpha(s) = -\beta'(s)/2$ and the phase function $\varphi(s)$ are related to the
$\beta(s)$ function by¹⁹ $\varphi'(s) = 1/\beta(s)$.
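As an indication of how the matrix method works in practice, the sketch below builds the one-period transfer matrix of a thin-lens FODO cell and reads off the phase advance and the periodic amplitude function from it. It relies on the standard parametrization of a stable one-period matrix, $M = \begin{pmatrix} \cos\mu + \alpha\sin\mu & \beta\sin\mu \\ -\gamma\sin\mu & \cos\mu - \alpha\sin\mu \end{pmatrix}$, which is not derived in the text (here $\gamma$ is an optical function, not the Lorentz factor). The thin-lens treatment, the function names, and the quadrupole spacing of 16 m are simplifying assumptions; the focal length of 16 m echoes the HERA value quoted earlier.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def thin_quad(f):
    """Thin quadrupole of focal length f (negative f means defocusing)."""
    return [[1.0, 0.0], [-1.0 / f, 1.0]]


def drift(length):
    """Field-free drift of the given length."""
    return [[1.0, length], [0.0, 1.0]]


def fodo_cell_matrix(f, spacing):
    """One period: focusing quad, drift, defocusing quad, drift."""
    m = thin_quad(f)
    for element in (drift(spacing), thin_quad(-f), drift(spacing)):
        m = matmul(element, m)  # later elements multiply from the left
    return m


def twiss_from_period_matrix(m):
    """Phase advance mu and periodic (beta, alpha) at the start of the cell."""
    cos_mu = 0.5 * (m[0][0] + m[1][1])
    if abs(cos_mu) >= 1.0:
        raise ValueError("unstable lattice: |trace| >= 2")
    # beta must be positive, so sin(mu) takes the sign of the upper-right element
    sin_mu = math.copysign(math.sqrt(1.0 - cos_mu**2), m[0][1])
    mu = math.atan2(sin_mu, cos_mu)
    beta = m[0][1] / sin_mu
    alpha = (m[0][0] - m[1][1]) / (2.0 * sin_mu)
    return mu, beta, alpha


mu, beta, alpha = twiss_from_period_matrix(fodo_cell_matrix(f=16.0, spacing=16.0))
print(f"phase advance per cell: {math.degrees(mu):.1f} degrees")
print(f"beta at cell start    : {beta:.2f} m")
print(f"alpha at cell start   : {alpha:.3f}")
```

Repeating this at different starting points around the cell traces out the amplitude function, and summing the phase advance over all cells and dividing by $2\pi$ gives the number of betatron oscillations per turn.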

At a given point $s$ along the reference trajectory, a particle has
coordinates $(z, z')$ in the vertical phase space. After one revolution, it will
come back to the same $s$ but with different coordinates $(z, z')$. After
many revolutions the particle's coordinates $(z, z')$ will trace an ellipse
in the $(z, z')$ phase space, as shown in Fig. 3.15. Integrating the phase
function around the circumference $C$ of the accelerator gives

$$\int_s^{s+C} d\varphi = 2\pi Q_z \tag{3.27}$$

where $Q_z$ is known as the betatron tune, the number of betatron
oscillations for a particle going around the accelerator once (in this case the
vertical tune $Q_z$; similarly, there is a horizontal tune $Q_x$).²⁰ If there are
any small imperfections in the ring, we need to avoid particles crossing
these imperfections at the same betatron phase in each revolution;
otherwise the beam would rapidly 'blow up'. Therefore, integer values
of the betatron tune should be avoided. More generally, tunes satisfying

$$\nu Q_x + \mu Q_z = \ell \tag{3.28}$$

with $\nu$, $\mu$, and $\ell$ integers must be avoided.

The area of the $(z, z')$ ellipse is $\pi\epsilon$, so, up to the factor of $\pi$, the
emittance is the volume of the two-dimensional phase space (to be
more precise, here $\epsilon_z$, and similarly $\epsilon_x$ for the horizontal phase space).
Liouville's theorem states that under the action of conservative forces,
the volume of beam phase space is conserved.²¹ Therefore, on moving
from one point $s$ to another along the reference trajectory, one would
get another ellipse but with the same area. So the transverse motion
of a particle can be visualized as an ellipse of fixed area changing its
shape depending on the location in the accelerator. If there is another
particle in the accelerator that is described by the same equations of
motion but has a different initial phase $\varphi_0$, then the motion of that
particle will be given by the same ellipse, since a different value of $\varphi_0$
simply corresponds to another point on the same ellipse. So, in fact,
one ellipse describes a family of trajectories, not just a single trajectory.
From the algebraic point of view, this family of trajectories is described
by the amplitude function $\beta(s)$ and the emittance $\epsilon$ (see eqn 3.25). But
the emittance is constant for 'coasting' (no acceleration) beam particles,
which means that we have reduced the problem from two dimensions
$(z, z')$ to one dimension ($\beta$). On inspecting Fig. 3.15, we can see that
the amplitude function $\beta(s)$ is the ratio of the beam width to the on-axis
angular spread.

Each ellipse represents a family of particles, and the whole ensemble of
beam particles consists of many of these families and ellipses. How do we
represent the whole ensemble? Considering the vertical phase space (the
same argument applies for the horizontal one), a particle beam injected
into an accelerator is characterized by initial conditions equivalent to a
cluster of points in the $(z, z')$ phase space, centred about the reference
trajectory (0,0). We choose an ellipse that closely surrounds this cluster,
and this represents the 'edge' of the beam. By convention, the ellipse
should contain 95% of particles. Then we follow this ellipse through the
accelerator; the ellipse, the corresponding amplitude function $\beta(s)$, and
the beam emittance $\epsilon$ represent the properties of the whole beam.

Using Liouville's theorem, we see that as long as the beam is not
accelerated, the $(z, z')$ and $(z, p_z)$ phase spaces are equivalent. But once
the beam is accelerated, Liouville's theorem applies only to the proper
phase space $(z, p_z)$, and only the normalized emittance $\epsilon_N = \beta\gamma\epsilon$ (here
$\beta$ is the speed in units of $c$, not the amplitude function, and $\gamma$ is the
Lorentz factor) is conserved. The volume
of the $(z, z')$ phase space shrinks with the momentum $p$ as $1/p$ and
consequently the beam width and the beam angular divergence shrink
during acceleration, each as $1/\sqrt{p}$; so a higher-energy beam fits into a
smaller-diameter beam pipe. This explains why high-energy accelerators
require chains of lower-energy accelerators; each accelerator in the chain
reduces the emittance enough for the beam to fit into the next
accelerator in the chain. In principle,
this accelerator chain could be eliminated if the beam pipe of the
high-energy accelerator were sufficiently large; however, this would increase
the size and hence the cost of the magnets.

An example of such a chain is at the LHC [73], where the source 
of protons is a bottle of hydrogen gas. A high voltage is used to strip 
electrons to provide the protons. A linear accelerator (Linac 2) acceler- 
ates the protons to an energy of 50 MeV. The beam is then injected into 
the Proton Synchrotron Booster (PSB), which accelerates the protons to 
1.4 GeV, followed by the Proton Synchrotron (PS), which accelerates the 
beam to 25 GeV. Protons are then injected into the Super Proton Syn- 
chrotron (SPS), where they are accelerated to 450 GeV before injection 
into the LHC. 


3.3 LHC dipole magnets 


The LHC superconducting dipoles use conductors made from a niobium–titanium
(NbTi) alloy.²² For these dipoles [73], which are capable of generating
a magnetic field B = 8.3 T, the current required is I = 11.85 kA.
This requires the NbTi superconductor to be cooled to a temperature
of 1.9 K using superfluid helium.²³ In a type I superconductor, the current
flows only on the surface, not in the bulk, which limits the useful
magnetic field. Therefore, high-field superconducting magnets rely on

22 NbTi is the only low-temperature
superconductor that is ductile, and 
hence most existing superconducting 
magnets are based on this alloy, although
there is interest in the niobium–tin
alloy Nb₃Sn, which might be able
to produce larger magnetic fields. This
would allow the option of an upgrade
to the LHC to reach a centre-of-mass energy
of 33 TeV. Nb₃Sn superconductors are
used in some high-field MRI magnets.


23 This has the advantage of benefiting 
from the remarkable thermal properties 
of superfluid helium, which has an ef- 
fective thermal conductivity orders of 
magnitude better than copper.

24 Copper is effectively an insulator
when the NbTi is in the superconduct- 
ing state. 


25 This type of cable is called Ruth- 
erford cable because it was developed 
at the UK Rutherford Laboratory, and 
it is used in all high-field supercon- 
ducting magnets. The cable is used in 
MRI scanners, so this is probably one 
of the most important but least known 
spin-offs from particle physics research. 


Fig. 3.16 Filaments (a), strands (b), 
and cable (c) of the type used for the 
LHC superconducting magnets, and a 
cross-section of one-quarter of the coils 
used in a main dipole (d). Note that the 
superconducting cable to create the di- 
pole field is placed in the small outlined 
boxes in (d). From [73]. 


type II superconductors, in which magnetic fluxoids can penetrate the 
volume. When there is a changing magnetic field in a superconductor, 
this will cause screening currents to flow. These are similar to eddy cur- 
rents, but as there is no resistance they do not decay with time. This 
magnetization appears as an unwanted error in the field produced by 
the magnet. The magnetization is proportional to the diameter of the 
wire carrying the current. When the magnetic field is changing with 
time, as happens when the beam energy is being ramped up from in- 
jection energy, an additional magnetization is created from the flow of 
current between neighbouring filaments. Therefore, a useful supercon- 
ducting cable has to be made from a very large number of very small 
filaments wound as ‘twisted pairs’ to minimize the magnetization. The 
cable for the LHC dipole magnets [73] is based on 6 μm-diameter filaments.
The filaments are embedded in a copper matrix for mechanical
support.²⁴ Each strand is made from 6300 filaments and is 0.825 mm in
diameter, and 36 strands are then used to make a cable, as shown in 
Fig. 3.16. 

For the same reasons as discussed above, it is important to minimize 
the flux linkage between wires. Twisting wires around each other as in 
a conventional cable is not sufficient, because the inner (outer) wires 
remain inside (outside). The wires need to be fully transposed; i.e. every 
wire must change places with every other wire along the length of the 
cable so that, averaged over the length, no flux is enclosed.²⁵

To create a perfect dipole field, a distribution of current density 
varying as cos θ around the beam pipe would be required. However,
a very nearly uniform dipole field near the centre of the beam pipe is 
created by blocks of superconductor arranged in the geometry shown
in Fig. 3.16(d). The currents in the blocks are optimized to produce a
uniform magnetic field. 


3.3.1 Engineering design details 


As there is insufficient space in the LHC tunnel for separate magnets 
for each beam, a ‘two-in-one’ magnet was designed in which the two 
magnetic volumes are inside a common cryostat [73], as illustrated in 
Fig. 3.17.²⁶ This magnet design is an amazing engineering tour de force,
as the figure shows. In comparison, the dipole magnets for a pp̄ collider
are straightforward.

26 This also allowed for significant cost savings compared with having two
separate magnets and cryostats.

The two beam pipes containing the counter-circulating proton beams 
are in the centre, surrounded by their respective dipole bending mag- 
nets with fields orientated to bend the two separate positively charged 
particle beams in opposite directions. The superconducting cable is 
held in place by non-magnetic ‘collars’ of austenitic steel, which can 
withstand a magnetic force of about 400 tonnes per metre of dipole. 
At nominal operation, the energy stored in each of the 1232 LHC di- 
poles is 6.93 MJ. If one small region of the superconductor becomes

Fig. 3.17 Cross-section of an LHC dipole in its cryostat. From [73].

27 An additional source of danger is the
electrical connection between the di- 
poles. This has a tiny but non-zero 
resistance. If this resistance is too large, 
this can also lead to catastrophic ther- 
mal runaway, as occurred on 19 Sep- 
tember 2008 and led to extensive dam- 
age. Many improvements have been 
made since then to prevent this type 
of problem. 


28 Somewhat analogous to what one
does with a garden hose to avoid dam- 
aging a sensitive plant. 


29 See Exercise 3.1. 


30 At the LHC, the pressure has to be 
kept below about 10⁻⁷ Pa.


non-superconducting (called a quench) for any reason, there will be 
Joule heating, which will increase the resistance, and hence there is the 
possibility of a catastrophic runaway, which would destroy the magnet. 
Therefore, sophisticated quench detection and protection systems are essential.
Quenches can be detected by the extra ‘IR’ voltage drop. Once
a quench is detected, a ‘quench heater’ is operated to force the entire
magnet to become non-superconducting and the energy is transferred to
a large ‘dump’ resistor.²⁷
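
To get a feel for the scale of the protection problem, the numbers quoted above can be combined in a few lines. This is only a rough sketch: it treats each dipole as an ideal inductor with stored energy E = ½LI², which is an assumption made here for illustration, and the variable names are arbitrary.

```python
def inductance_from_energy(energy_j: float, current_a: float) -> float:
    """Inductance of an ideal inductor storing energy E at current I: L = 2E / I^2."""
    return 2.0 * energy_j / current_a**2


N_DIPOLES = 1232              # number of LHC main dipoles (from the text)
ENERGY_PER_DIPOLE_J = 6.93e6  # stored energy per dipole at nominal operation
CURRENT_A = 11.85e3           # nominal dipole current

# Implied inductance of one dipole under the ideal-inductor assumption.
inductance_h = inductance_from_energy(ENERGY_PER_DIPOLE_J, CURRENT_A)

# Total magnetic energy stored in all main dipoles, in gigajoules.
total_energy_gj = N_DIPOLES * ENERGY_PER_DIPOLE_J / 1e9

print(f"implied inductance per dipole: {inductance_h * 1e3:.1f} mH")
print(f"total energy in the dipoles:   {total_energy_gj:.2f} GJ")
```

The total is more than twenty times the energy stored in the two beams, which is why the magnet protection system is as critical as the beam dump.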

The energy stored in the two beams at nominal operation is 362 MJ, 
so they have sufficient energy to destroy large parts of the LHC machine 
and the detectors. Therefore, many beam loss monitors are installed in 
the machine, and if the rates rise above a threshold, kicker magnets are 
operated to deflect the beams out of the ring towards a beam dump [73]. 
The beam dump must be able to dilute the peak energy density of the 
beam before it is absorbed. At the LHC, this is done by ‘spraying’ the 
beam in a spiral pattern as it enters the dump.²⁸


3.4 Colliders and fixed-target accelerators 


The centre-of-mass energy in a symmetric collider, with each beam having
energy E, is simply given by √s = 2E. For a fixed-target collision,
with a beam energy E and a target mass m (assuming E ≫ m), we can
show²⁹ that

\sqrt{s} = \sqrt{2mE} \qquad (3.29)

Therefore, colliders have an obvious advantage in maximizing the 
centre-of-mass energies over fixed-target experiments. Although this 
was realized a long time ago, the challenge of achieving a useful 
interaction rate was formidable. The interaction rate for a given physics 
process depends on the luminosity (see Section 3.5). In a fixed-target 
geometry, it is only necessary for one high-intensity beam to collide with 
a block of matter to achieve very high luminosities. For a collider, this 
is far more challenging because we need two intense beams, which both 
have to be focused to very small transverse dimensions at the interaction 
points to achieve a useful luminosity. In a fixed-target accelerator, the 
beam must be kept for a few seconds before it is extracted. However, 
in a collider, it takes time to ‘fill’ the machine with sufficient numbers 
of particles before they are accelerated to the peak energy, and then 
the beams have to be kept for several hours while data are taken. This 
obviously puts much more significant demands on the quality of the 
accelerator. We must avoid the dangerous resonances and ensure that 
imperfections, which can increase the emittances of the beams, are kept 
to a minimum. We also need an extremely high vacuum to minimize 
beam losses and backgrounds in the detector.³⁰ The defocusing effects
of one beam on the other must also be kept under control. 
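
The energy advantage of a collider is easy to quantify. The sketch below compares √s for a symmetric collider with the fixed-target value, using both the approximation of eqn 3.29 and the exact invariant s = m_b² + m_t² + 2Em_t for a target at rest. The function names are illustrative, and the choice of a 450 GeV proton beam on a proton target is simply an example.

```python
import math

PROTON_MASS_GEV = 0.938272  # proton rest mass in GeV/c^2


def sqrt_s_collider(beam_energy_gev: float) -> float:
    """Symmetric collider: sqrt(s) = 2E."""
    return 2.0 * beam_energy_gev


def sqrt_s_fixed_target_approx(beam_energy_gev: float, target_mass_gev: float) -> float:
    """Eqn 3.29, valid for E >> m: sqrt(s) = sqrt(2 m E)."""
    return math.sqrt(2.0 * target_mass_gev * beam_energy_gev)


def sqrt_s_fixed_target(beam_energy_gev: float, target_mass_gev: float,
                        beam_mass_gev: float = PROTON_MASS_GEV) -> float:
    """Exact result for a target at rest: s = m_b^2 + m_t^2 + 2 E m_t."""
    s = beam_mass_gev**2 + target_mass_gev**2 + 2.0 * beam_energy_gev * target_mass_gev
    return math.sqrt(s)


E = 450.0  # GeV, example beam energy
print(f"collider:              {sqrt_s_collider(E):.1f} GeV")
print(f"fixed target (approx): {sqrt_s_fixed_target_approx(E, PROTON_MASS_GEV):.2f} GeV")
print(f"fixed target (exact):  {sqrt_s_fixed_target(E, PROTON_MASS_GEV):.2f} GeV")
```

Because the fixed-target √s grows only as √E, doubling the beam energy of a fixed-target machine gains a factor of about 1.4 in √s, whereas a collider gains the full factor of 2.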


Some of these issues are common to all types of colliders. We al- 
ways want higher luminosity, and how to achieve this is discussed in 
Section 3.5. The special issue for circular e⁺e⁻ colliders is synchrotron
radiation (see eqn 3.15), which puts a practical limit on the beam energy.
Therefore, large e⁺e⁻ colliders like LEP have simple (and cheap)
magnets but require very efficient RF cavities—which means supercon- 
ducting cavities (see Section 3.1). Hadron colliders do not suffer from 
significant synchrotron radiation losses, so we can have much higher 
beam energies. To optimize the beam energy for a given cost, we need to 
use the highest magnetic field possible for the dipole bending magnets. 
The critical technological challenge is the industrial-scale production of 
very high-quality superconducting magnets (see Section 3.3). 

In an e⁺e⁻ collider, the energy defines which processes can be studied³¹
and which particles can be created or discovered. In a hadron
collider such as the LHC, there is a more subtle interplay between en- 
ergy and luminosity. We are really interested in the rates for processes 
at the parton level, and the partons only carry a fraction of the momen- 
tum of the protons, so the energy reach of a hadron collider depends 
crucially on both energy and luminosity. Therefore, there is some com- 
plementarity between the very clean physics that can be performed at 
e⁺e⁻ colliders, compared with the higher energy reach of hadron colliders,
in which the event reconstruction is more complicated because,
in addition to the interesting parton—parton collisions, there are always 
interactions of the remaining ‘spectator quarks’. 

The HERA collider was a special case because it used e±p collisions.
This required a high-energy proton ring (HERA I 820 GeV, HERA II
920 GeV) and a much lower-energy electron ring (27.5 GeV) to minimize
synchrotron radiation.³² As will be discussed in Chapter 9, this gave the
most precise determination of the quark and gluon distribution func- 
tions. These are required for all calculations of cross sections at hadron 
colliders like the LHC; the details will be covered in the same chapter. 

The LHC uses pp collisions, but earlier hadron colliders used pp̄ collisions.
The important advantage of pp̄ colliders is that the two beams
can be contained in the same beam pipe. The critical issue for these
colliders was how to produce intense and low-emittance p̄ beams. This
will be discussed in Section 3.6. Colliders have also been operated using 
heavy ions, but these will not be discussed in this book. 


3.5 Luminosity


The two most important numbers in experimental (accelerator) particle
physics are the energy and the luminosity of an accelerator. The luminosity
ℒ translates a cross section σ into the rate R of produced events:³³

R = \mathcal{L}\,\sigma = \mathcal{L} \int \frac{d\sigma}{d\Omega}\, d\Omega \qquad (3.30)

31 Energy conservation implies that for
a beam energy E, in a symmetric col- 
lider, the most massive particle that 
can be pair-produced will have mass 
m=E. In general, the sum of the 
masses of the final-state particles must 
be less than the centre-of-mass en- 
ergy (2E). 


32 The ep CMS energies were 300 and 
318 GeV, respectively. 


33 Or observed events, if detector effects 
are included.

34 This simple formula assumes that the
target is ‘thin’, i.e. that the probabil- 
ity that an individual beam particle 
interacts in the target is much less 
than one. 


Fig. 3.18 Simple geometry for
derivation of the luminosity formula.
Symbols are explained in the text.
[Figure labels: transverse space (x, z), interaction point, area A, n₁, n₂.]


35 This is not very realistic, since typic- 
ally the horizontal beam size is bigger 
than the vertical one. 


where Ω is the solid angle for the angular differential cross section. In
fixed-target experiments,

\mathcal{L} = n \rho l \qquad (3.31)

where n is the number of particles per second in the beam (typically
10¹² s⁻¹), ρ is the density of target particles, and l is the target length
(typically, ρl ~ 10²³ cm⁻²), giving a luminosity ℒ ~ 10³⁵ cm⁻² s⁻¹,
which is large in comparison with that at a high-energy collider.³⁴

We begin by deriving useful formulae for the luminosity and then 
discuss how to optimize it. 

We start with the simple scenario sketched in Fig. 3.18, with two rectangular
bunches of particles colliding head on. They contain respectively
n₁ and n₂ randomly distributed particles. A is the transverse area of the
wider bunch and there are b bunches in each beam, with frequency of
revolution f. Then

\mathcal{L} = \frac{n_1 n_2}{A}\, b f \qquad (3.32)


In a collider, n₂/A corresponds to the fixed-target ρl; it is the number
of target particles per unit area. If particles are distributed in colliding
bunches not randomly but according to normalized density distributions
ρ₁ and ρ₂, then

\mathcal{L} = b f n_1 n_2 \int_S \rho_1 \rho_2 \, dS \qquad (3.33)


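where S is the transverse space. Introducing beam currents I₁ = n₁efb
and I₂ = n₂efb, where e is the electron charge,

\mathcal{L} = \frac{I_1 I_2}{e^2 b f} \int_S \rho_1 \rho_2 \, dS \qquad (3.34)

Assuming Gaussian distributions with σ_x = σ_z = σ for simplicity,³⁵

\rho_{1,2} = \frac{1}{2\pi\sigma_{1,2}^2} \exp\left(-\frac{x^2 + z^2}{2\sigma_{1,2}^2}\right) \qquad (3.35)

we get

\mathcal{L} = \frac{I_1 I_2}{e^2 b f}\, \frac{1}{2\pi(\sigma_1^2 + \sigma_2^2)} \qquad (3.36)

where 2π(σ₁² + σ₂²) represents an effective area.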
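
The overlap integral behind the effective area can be checked numerically. The following sketch integrates the product of two round Gaussian densities on a grid and compares the result with the closed form 1/[2π(σ₁² + σ₂²)]. The grid size, the integration range, and the function names are arbitrary choices made for this illustration.

```python
import math


def gaussian_2d(x: float, z: float, sigma: float) -> float:
    """Normalized round Gaussian density in the transverse plane."""
    return math.exp(-(x * x + z * z) / (2.0 * sigma * sigma)) / (2.0 * math.pi * sigma * sigma)


def overlap_numeric(sigma1: float, sigma2: float,
                    half_width: float = 8.0, steps: int = 200) -> float:
    """Midpoint-rule integral of rho1*rho2 over a square of half-side
    half_width * min(sigma1, sigma2), centred on the beam axis."""
    limit = half_width * min(sigma1, sigma2)
    h = 2.0 * limit / steps
    total = 0.0
    for i in range(steps):
        x = -limit + (i + 0.5) * h
        for j in range(steps):
            z = -limit + (j + 0.5) * h
            total += gaussian_2d(x, z, sigma1) * gaussian_2d(x, z, sigma2)
    return total * h * h


def overlap_analytic(sigma1: float, sigma2: float) -> float:
    """Closed form: 1 / (2 pi (sigma1^2 + sigma2^2))."""
    return 1.0 / (2.0 * math.pi * (sigma1 * sigma1 + sigma2 * sigma2))


numeric = overlap_numeric(1.0, 2.0)
analytic = overlap_analytic(1.0, 2.0)
print(f"numeric = {numeric:.6f}, analytic = {analytic:.6f}")
```

For equal beam sizes the effective area reduces to 4πσ², which is the form most often quoted for colliders.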

In order to illustrate how this relates to the accelerator parameters,
we will assume that the vertical and the horizontal emittances are equal,
ε_x = ε_z = ε/π, and that the horizontal and the vertical beta functions
are equal, β_x = β_z = β*. We also assume that the two beams have equal
emittance. The asterisk in β* is the common symbol to indicate that the
function is calculated at the interaction point (IP). As

\sigma^2 = \epsilon \beta^* \qquad (3.37)

\mathcal{L} = \frac{I_1 I_2}{4\pi e^2 b f \beta^* \epsilon} \qquad (3.38)

Equation 3.38 gives the best guide for understanding how to maximize 
the luminosity for a collider. First, it is clear that increasing the beam 
currents I₁ and I₂ is desirable. If only a limited number of protons can fit
in one bunch, then it is advantageous to increase the number of bunches. 
However, the beam currents cannot be increased indefinitely, because 
each beam exerts electromagnetic forces on the other beam at each IP. 
The net effect of one bunch on a counter-rotating bunch is similar to 
that of an additional quadrupole magnet, and therefore changes the 
horizontal and vertical Q values.³⁶ This is very dangerous because, even
if the operating point of the machine is away from integer resonances (see 
eqn 3.28), such a tune shift can push the beams too close to a resonance, 
resulting in very rapid beam loss. This beam—beam tune shift thus limits 
the ultimate luminosity that can be achieved in a hadron collider.³⁷

Note the rather counter-intuitive result of eqn 3.38 that if the beam 
currents are at the beam—beam limit, then the luminosity can be in- 
creased by decreasing the number of bunches. However, the optimization 
of the number of bunches must also respect practical constraints im- 
posed by the detectors. If the number of bunches is reduced, the number 
of collisions per bunch crossing will increase. Therefore, collisions with 
one interesting event will also contain a background of many other 
‘minimum-bias’ collisions, which are effectively a noise source.³⁸ There
is therefore a trade-off between maximizing luminosity and having clean
enough events to be useful, and there is no perfect solution. At the
LHC design luminosity of 10³⁴ cm⁻² s⁻¹, there will be approximately 25
collisions per bunch crossing (every 25 ns). 
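
As a rough cross-check of these numbers, the luminosity for two equal round Gaussian beams, n₁n₂bf/(4πσ²), and the resulting mean number of collisions per crossing can be evaluated directly. All the input values below are assumptions introduced for this sketch (nominal LHC design parameters and an inelastic pp cross section of about 80 mb); none of them is quoted in the text, and crossing-angle effects are ignored.

```python
import math


def collider_luminosity(n1: float, n2: float, bunches: int,
                        f_rev_hz: float, sigma_cm: float) -> float:
    """Equal round Gaussian beams: L = n1 n2 b f / (4 pi sigma^2), in cm^-2 s^-1."""
    return n1 * n2 * bunches * f_rev_hz / (4.0 * math.pi * sigma_cm**2)


def mean_pileup(luminosity: float, sigma_inel_cm2: float,
                bunches: int, f_rev_hz: float) -> float:
    """Mean number of inelastic collisions per bunch crossing."""
    return luminosity * sigma_inel_cm2 / (bunches * f_rev_hz)


# Assumed nominal design values (not taken from the text).
N_PER_BUNCH = 1.15e11     # protons per bunch
BUNCHES = 2808            # bunches per beam
F_REV_HZ = 11245.0        # revolution frequency
SIGMA_CM = 16.7e-4        # transverse beam size at the IP (16.7 micrometres)
SIGMA_INEL_CM2 = 80e-27   # ~80 mb inelastic cross section (assumed)

lumi = collider_luminosity(N_PER_BUNCH, N_PER_BUNCH, BUNCHES, F_REV_HZ, SIGMA_CM)
mu = mean_pileup(1e34, SIGMA_INEL_CM2, BUNCHES, F_REV_HZ)

print(f"luminosity ~ {lumi:.2e} cm^-2 s^-1")
print(f"mean collisions per crossing at 1e34 ~ {mu:.1f}")
```

Halving the number of bunches at fixed luminosity doubles the mean number of collisions per crossing, which is the detector-side cost of the optimization discussed above.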

The next parameters to optimize are the emittances. These depend on 
the quality of the proton source. Although Liouville’s theorem predicts 
the conservation of beam phase space, any imperfections can increase the 
emittances. Finally, one can increase the luminosity by decreasing the 
values of β*. This is achieved by using very strong focusing quadrupole
magnets, which will usually be superconducting to achieve the highest
field gradients. The consequence of reducing the transverse beam size
at the IP is that the beam divergences will increase. This limits how
far β* can be reduced before the beam losses from particles hitting the
beam pipe become unacceptable. This, in turn, implies that there is 
an advantage in bringing the quadrupole magnets closer to the IP, but 
as this will reduce the space for detectors in the forward region, the 
trade-off will depend on the particular physics being studied. 


3.6 pp̄ colliders


Hadron colliders can use two separate p beams as in the LHC. However, 
this requires two separate magnetic fields for the counter-rotating

36 This effect is called the ‘beam–beam’
tune shift. See Wille in Further Reading
for a full explanation. The horizontal
and vertical Q values are not related
to the Q-values of the RF cavities discussed
in Section 3.1.3.


37 The situation is different in e⁺e⁻
colliders because of the beam ‘cooling’ 
from synchrotron radiation. 


38 How to cope with this is covered in 
more detail in Section 13.2.

39 They also demonstrated to sceptical
physicists that very clean results
could be obtained in this difficult
environment. 




Fig. 3.19 Schematic view of a 
synchrotron with transverse pickup 
(TP), fast amplifier (A), and 
transverse kicker (TK) electrodes for 
stochastic cooling. 


40 Hence the name ‘stochastic cooling’. 


p beams. With counter-rotating beams of particles and antiparticles, 
the electric and magnetic fields are automatically correct for both 
beams, so only one ring is required. This enabled the relatively cheap 
conversion of proton synchrotrons at CERN and Fermilab into pp̄
colliders. The use of pp̄ colliders was very important as it led to the
discovery of the W and Z bosons and the top quark, as well as providing
the most precise measurement of the W mass.³⁹

The big challenge for pp̄ colliders was to produce very intense, low-emittance
beams of antiprotons, which was achieved using the technique
of stochastic cooling [132]. Liouville’s theorem prevents a reduction in 
beam phase space, but it is based on the assumption that the beam 
is continuous, whereas a real beam is composed of a finite number of 
discrete particles. Consider first the extreme case of a single particle 
in a beam; we can detect its transverse position using a beam pickup 
electrode at one place in the ring, which generates a signal proportional 
to the displacement about the central orbit. The signal is fed across the 
ring via an amplifier to two deflecting plates as shown in Fig. 3.19. 

The betatron function at the deflecting plates is an odd multiple of 
π/2 out of phase with that of the pickup, so that a deviation in position
from the reference orbit is converted to a difference in angle. The shorter 
path for the cable compared with the particle trajectory compensates for 
the difference in speeds of the particles and the electrical signals as well 
as the delay in the amplifier. Therefore, a suitable voltage pulse can be 
used to make a correction to bring the particle back to the central orbit. 

In a real beam with a large number of particles, this cooling technique
works on a statistical basis.⁴⁰ If the speed of the amplifier were
sufficiently fast, then each individual particle in the beam would have 
the correction applied to bring it back to the central orbit. Therefore, the 
cooling works better the shorter the sample time of the amplifier, since 
this determines how many particles are affected. Hence, the cooling rate 
depends on the bandwidth of the amplifier, W. As stochastic cooling 
requires the detection of fluctuations, it works faster for smaller numbers
of beam particles, N, and the cooling time τ ~ N/(2W). For useful
bunches, we need N ~ 10¹², and with achievable bandwidth amplifiers
this leads to a cooling time of the order of 1 day. 
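
The statistical nature of the cooling can be illustrated with a toy model. In the sketch below, each turn the beam is divided into samples of a fixed size (standing in for the time resolution of the amplifier), and every particle in a sample is kicked by the sample's mean displacement. The model assumes perfect mixing between turns, unit gain, and no amplifier noise; these idealizations and the function names `cool` and `rms` are introduced here for illustration only.

```python
import random


def rms(values):
    """Root-mean-square displacement of the beam particles."""
    return (sum(v * v for v in values) / len(values)) ** 0.5


def cool(displacements, sample_size, turns, gain=1.0, seed=1):
    """Toy stochastic cooling: each turn, shuffle the particles (perfect mixing),
    split them into samples, and subtract gain * (sample mean) from every
    particle in the sample."""
    rng = random.Random(seed)
    x = list(displacements)
    for _ in range(turns):
        rng.shuffle(x)
        for start in range(0, len(x), sample_size):
            stop = min(start + sample_size, len(x))
            kick = gain * sum(x[start:stop]) / (stop - start)
            for i in range(start, stop):
                x[i] -= kick
    return x


# 2000 particles with unit-width Gaussian displacements.
source = random.Random(0)
beam = [source.gauss(0.0, 1.0) for _ in range(2000)]

fast = rms(cool(beam, sample_size=10, turns=20))    # wide-bandwidth pickup
slow = rms(cool(beam, sample_size=100, turns=20))   # narrow-bandwidth pickup
print(f"initial rms = {rms(beam):.3f}, 10 per sample: {fast:.3f}, 100 per sample: {slow:.3f}")
```

In this model the variance falls by roughly a factor (1 − 1/N_s) per turn for samples of N_s particles, so smaller samples (larger bandwidth) or fewer particles cool faster, in line with the scaling quoted above.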

It is also necessary to provide momentum cooling to reduce the spread 
in momenta. A pickup electrode can be used to measure the revolution 
frequencies of particles, and the signal is fed into an amplifier via a filter. 
The filter eliminates any signal for particles with the correct frequency 
(i.e. momentum), and higher frequencies give a phase shift of π. This
filtered signal can then be fed into an accelerating cavity. Again this 
system would work perfectly for individual particles and works on a 
statistical basis for beams with a finite number of particles. 


3.6.1 CERN pp̄ collider


The antiprotons were produced by collisions of 26 GeV/c protons from 
the CERN PS with a copper target. Some of the produced antiprotons 


with momenta ~ 3.5 GeV/c were collected by a large-aperture low-energy 
ring called the antiproton accumulator. The pulse of antiprotons was 
first cooled and then moved to the side of the aperture, where an intense 
stack of antiprotons was built up, allowing a new injection of antiprotons 
every 2.2 s. The antiprotons were accumulated and cooled using stochastic
cooling for about one day and then injected back into the PS and
accelerated to 26 GeV/c; they were then injected into the SPS together
with counter-rotating bunches of protons. The protons and antiprotons
were then accelerated to an energy of 315 GeV (initially 270 GeV), and a
run would last until sufficient antiprotons had been accumulated or the
beams were lost. The peak luminosity achieved was ∼2 × 10³⁰ cm⁻² s⁻¹.


3.6.2 Tevatron pp̄ collider


Similar principles were applied to the Tevatron. For the second phase of 
the Tevatron (Run 2), a 150 GeV synchrotron called the main injector
was built to provide higher yields of antiprotons. The acceptance of 
antiprotons and the cooling were split between two separate machines. 
After cooling, the protons and antiprotons were re-accelerated in the 
main injector and then injected into the superconducting Tevatron and 
accelerated to an energy of 0.98 TeV. The peak luminosity achieved was
∼4 × 10³² cm⁻² s⁻¹.


3.7 Future accelerators 


The RF technology for accelerating particles has been developed over 
many years, providing accelerators with steadily growing energies and 
luminosities. Proposed new accelerators, such as the International 
Linear Collider and the Future Circular Collider, are still based on RF 
technology. However, the size of these accelerators and their high cost 
demand a new, cheaper technology to allow the field of particle physics 
to move beyond current plans and also make applications affordable, for 
example as a source of ultrashort X-ray pulses to study the dynamics 
of chemical reactions. One such technology where progress in recent 
years has been particularly impressive is plasma acceleration. A plasma 
is an ionized gas such as hydrogen. By moving electrons away from 
quasistationary ions, it is possible to create electric fields up to three 
orders of magnitude larger than those obtainable using RF technology, 
thus allowing accelerators to be much smaller in size. A high-intensity 
laser pulse or a train of laser pulses or bunches of charged particles 
like electrons or protons can be used to separate electrons from ions, 
creating very high electric fields in plasmas. Electrons have been 
accelerated to energies above 4 GeV in a 9 cm-long accelerator in the
BELLA laboratory at Berkeley,⁴¹ demonstrating the clear potential of
this technology. The maximum beam energies reached or expected to 
be reached at future facilities are presented in Fig. 3.20, together with 
the longer-term prospects for plasma accelerators.

41 Berkeley Lab Laser Accelerator
Center.

Fig. 3.20 The Livingston plot of the
maximum beam energy for conventional
RF and plasma accelerators. Image
credit: R. Assmann.


Chapter summary


This chapter has given a brief introduction to the field of accelerator 
physics. 


We have seen how RF cavities are used to accelerate charged particles. 


Dipole magnets are required for circular machines, but quadrupoles are 
also essential for beam focusing. The oscillations of beam particles about 
their central orbit have been explained and the beam emittance defined. 


Superconducting cables are needed for the highest-energy synchrotrons 
such as the LHC.


Colliders are the best route to the highest-energy collisions, provided 
sufficient luminosity can be achieved. A brief explanation of luminosity 
and its optimization has been given. 


The way in which Liouville’s theorem can be circumvented with the use of 
stochastic cooling has been reviewed. This was essential for the operation 
of pp̄ colliders.


Further reading 


• Wilson, E. (2001). An Introduction to Particle Accelerators.
Oxford University Press. This is a very good
introduction to the subject.

• Wille, K. (2000). The Physics of Particle Accelerators:
An Introduction. Oxford University Press.
This is a more advanced introduction to the
subject.

• Bryant, P. and Johnsen, K. (1993). The Principles of
Circular Accelerators and Storage Rings. Cambridge
University Press.

• The Staff of the CERN Proton–Antiproton Project
(1981). First proton–antiproton collisions in the CERN
SPS collider. Phys. Lett. B, 107, 306.


Exercises


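(3.1) Use relativistic kinematics to derive eqn 3.29.

(3.2) (a) Using the definition in eqn 3.7, show by partial
differentiation that

\frac{df/f}{dp/p} = \left( \frac{\partial f}{\partial v} \frac{\partial v}{\partial p} + \frac{\partial f}{\partial R} \frac{\partial R}{\partial p} \right) \frac{p}{f}

(b) Show that

\frac{\partial v}{\partial p} = \frac{1}{m\gamma^3}

(c) Combine the results from (a) and (b) to derive
eqn 3.8.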

(3.3) (a) Differentiate twice the trial solution of Hill’s
equation (3.25) in the z direction.

(b) Using the requirement that the coefficients of
sin and cos must be identical, together with
the results from (a), show that our trial solution
to eqn 3.24 is valid provided the condition
φ′ = 1/β is satisfied.

(c) Verify eqn 3.26 by equating coefficients of the
cos terms in Hill’s equation (3.24).

(3.4) (a) Write down the matrices in the vertical direction
(z, z′) for a thin focusing (F) and defocusing
(D) lens and a drift tube (O) (i.e. with no
magnetic forces).

(b) Use matrix multiplication to evaluate the matrix
for the combination FOD and hence derive
eqn 3.21.


(3.5) Consider the quadrupole magnetic field given by
B_x = −gz and B_z = −gx. Show that this field satisfies
Maxwell’s equations in free space. Sketch the
resulting magnetic field lines. Hence sketch the magnetic
forces acting on a positive particle travelling
parallel to the beam axis (s). Consider the trajectory
of such a particle after traversing such a thin
quadrupole lens of length l. The particle has an initial
horizontal coordinate x = x₀. Show that all such
particles will have x = 0 after travelling a distance
given by the focal length (eqn 3.20). Hint: Assuming
the lens is thin, you may neglect the change in
x of the particle inside the quadrupole.


(3.6) A collider with two beams of unequal width has
3 × 10¹¹ particles per bunch and two bunches in each
beam, and the frequency of revolution is 1 MHz.
Particles are uniformly distributed in cylindrical
bunches that move parallel to the cylinder axis.
The radius of each bunch in the wider beam is
4 × 10⁻⁴ m. Calculate the luminosity.


(3.7) Describe a method to determine the luminosity in
an e⁺e⁻ collider and explain what detectors you
would need. Why can this technique not be used in
a hadron collider?


(3.8) This question is about the measurement of luminosity
at a pp collider such as the LHC using Van
der Meer (VDM) scans. Consider a collider with a
revolution frequency f_rev and n_b colliding bunches.
Let the numbers of particles in beams 1 and 2 be
n₁ and n₂ and let the normalized bunch densities
be ρ₁(x, y) and ρ₂(x, y). The luminosity is defined
by the interaction rate for a process with a given
cross section σ by R = ℒσ. In terms of the beam
parameters, the luminosity is expressed by

\mathcal{L} = n_b f_{rev} n_1 n_2 \int \rho_1(x, y)\, \rho_2(x, y)\, dx\, dy

Explain the origin of this equation. Assume that
the particle densities are uncorrelated in the x and
y directions. The luminosity can then be written as

\mathcal{L} = n_b f_{rev} n_1 n_2 \Omega_x \Omega_y

where the beam overlap is defined in x (with an
equivalent definition in y) as

\Omega_x = \int \rho_1(x)\, \rho_2(x)\, dx

Assume that the beams have Gaussian distributions
with a common centre and root mean squares σ_x1
and σ_x2. Show that

\Omega_x = \frac{1}{\sqrt{2\pi}\, \sqrt{\sigma_{x1}^2 + \sigma_{x2}^2}}

Let R(x) be the interaction rate as a function of the
separation of the beams in the x direction:

R(x) \propto \exp\left[ -\frac{x^2}{2(\sigma_{x1}^2 + \sigma_{x2}^2)} \right]

Therefore, the value of σ_x1² + σ_x2² can be determined
from a VDM scan if the interaction rate for some
particular process, R(x), is measured as the separation
between the two beams, x, is varied. Assuming
that the fraction of bunch crossings that register
a hit in a counter is p, show that the mean number
of collisions that produce such hits is given by
μ = −ln(1 − p). Why is it advantageous to use a
small counter? What is the problem with this technique
at very high luminosity? Suggest a possible
detector technique that could be used for such a
counter.


(3.9) Synchrotrons have a periodic ring-shaped lattice of
focusing and bending magnets. Specifying the position
of a beam particle on the circumference of the
accelerator by the distance s, the equations of motion
in both horizontal and vertical directions about
the ideal orbit have the form

\frac{d^2 x}{ds^2} + K(s)\, x = 0

where K(s + L) = K(s) is periodic. The solution is
a quasiperiodic function of the form

x(s) = \sqrt{\epsilon \beta(s)}\, \cos[\phi(s) - \phi_0]

where ε and φ₀ are constants. Both ε and β(s) have
dimensions of length. The phase φ(s) is related to
the beta function β(s) by dφ/ds = 1/β. Explain the
significance of the beta function.

Show that x and x′ = dx/ds satisfy

\frac{x^2 + [\beta(s) x' + \alpha(s) x]^2}{\beta(s)} = \epsilon

where

\alpha(s) = -\frac{1}{2} \frac{d\beta(s)}{ds}

Reduce the expression to the standard form for an
ellipse in (x, x′) space and show that its area is πε.
This is the emittance of the beam (in one transverse
direction). Explain its importance for accelerator
design and control.


Particle detectors 


4.1 Introduction 


All the experimental discoveries that underpin our understanding of par- 
ticle physics rely on particle detectors, so a good knowledge of how these 
sophisticated devices work is essential. The complexity of particle de- 
tectors has grown enormously from very simple beginnings to the very 
powerful detector systems used at the LHC. As in the rest of this book, 
we will not take a historical approach but try to find the easiest and 
most direct way to explain the fundamental physics. We will start in 
Section 4.2 with an overview of a collider detector, focusing on what 
the requirements are and giving a simple description of how the differ- 
ent subsystems are used to identify some types of particles and measure 
the energy of individual particles or ‘jets’. This will give us a good idea 
of what a collider detector looks like, but it will tell us nothing about 
how any particular detector actually works. To gain any useful under- 
standing, we need to consider the basic detector physics that will explain 
quantitatively the performance of real detectors. 

We start this systematic approach in Section 4.3 by considering how 
high-energy particles interact with matter and lose energy. The processes 
result in a relatively small number of electron-ion pairs, so the next issue 
to consider is how we can use this effect to create a measurable signal.1
The fundamental detector physics of how signals are generated will be 
described in Section 4.4, since this step is obviously essential for any real 
understanding of how a particle detector works. 

Armed with this knowledge, we can start to consider how basic particle 
detectors actually work. In Section 4.6, we will look at two techniques 
used for tracking the trajectory of charged particles: wire chambers and 
silicon detectors. Next, in Section 4.7, we will consider how to make 
energy measurements for charged and neutral particles in devices called 
calorimeters. 

To select interesting events for permanent storage, while rejecting very 
high rates of background processes, very powerful trigger systems are 
required. We will review these briefly in Section 4.10, with a particular 
emphasis on LHC collider detectors, since these present the biggest chal- 
lenges from the triggering perspective. Even with very powerful trigger 
systems, many petabytes of data are written to permanent storage every 
year at the LHC. Therefore, extremely powerful computer systems are 
required to process these data and to run the Monte Carlo simulation 
programmes used to understand the detector performance and correct 
for the inevitable imperfections. This computing requires $\sim 10^5$ CPUs,
which would be difficult to deal with in one facility.2 The problem has
been solved by the use of GRID computing, in which the CPUs are distributed
over many computer centres across the world. GRID computing
is now a major research area in its own right, but will not be covered
further in this book.

1 This is not strictly correct in silicon detectors, where we deal with electron-hole pairs.

2 The biggest practical problem with very large computer farms is how to provide sufficient cooling to remove the heat.

Having understood the basic building blocks, we will then look in 
Section 4.11 at how large particle detectors are designed and how they 
work. Here and in other chapters, we will use case studies of real de- 
tectors to see how the fundamental principles are applied in practice. 
Interestingly, we will see that there is no perfect solution to the many 
design challenges, and there are always difficult trade-offs in the design 
of any large detector. The discussion will focus on the design of the 
general-purpose LHC detectors, since these are the largest and most 
sophisticated detector systems ever built. We will also briefly consider 
neutrino detectors, since the constraints on these are not the same as 
those on collider detectors and the resulting systems are very different. 


4.2 Overview of collider detectors 


As an example of a collider detector, we will look at the general-purpose 
detectors at the LHC. As will be discussed in Chapter 13, the principal
aims of the LHC are the study of the Higgs sector and the search for 
new physics beyond the Standard Model (SM), such as supersymmetry. 
Higgs bosons or any exotic particles will be heavy and will in general 
decay rapidly to SM particles, so we need to optimize the detector for 
energetic SM particles. We must be able to measure the momenta of 
photons, electrons, muons, taus, and hadron jets. As well as measuring 
the momenta, ideally we need to identify the different particles, which is 
non-trivial since the rates for hadron jets are $O(10^5)$ times higher than
for leptons. We also need to distinguish between jets from b and c quarks 
and jets from light quarks. For neutrinos or exotic weakly interacting 
particles, such as SUSY WIMPs (see Section 13.4.1), direct detection 
is clearly impractical. However, we can infer the transverse momentum 
of these ‘invisible’ particles by using conservation of momentum. For 
this technique to be effective, we require detectors with calorimeters 
that cover most of the $4\pi$ solid angle (this technique is discussed in
Chapter 13). 
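
As a rough illustration of the momentum-conservation argument above, the following Python sketch sums the transverse momenta of the visible particles in an event and takes minus that sum as the missing transverse momentum. The function name, the (px, py) layout and the toy numbers are assumptions made for this example and are not taken from any real reconstruction software.

```python
import math

def missing_transverse_momentum(visible):
    """Return (px_miss, py_miss, pt_miss) for a list of visible (px, py) in GeV.

    Assumes the colliding beams carry negligible transverse momentum, so the
    vector sum of all final-state transverse momenta should vanish.
    """
    sum_px = sum(px for px, _ in visible)
    sum_py = sum(py for _, py in visible)
    px_miss, py_miss = -sum_px, -sum_py
    return px_miss, py_miss, math.hypot(px_miss, py_miss)

# Toy event with hypothetical numbers: one lepton and two jets
toy_event = [(40.0, 10.0), (-25.0, 5.0), (-5.0, -45.0)]
px_miss, py_miss, pt_miss = missing_transverse_momentum(toy_event)
print(f"missing pT = {pt_miss:.1f} GeV")
```

Only the transverse components enter, which is why energy lost down the beam pipes at small angles matters little for this quantity.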

A very schematic view of the principal components of a general-purpose
collider detector is shown in Fig. 4.1. Working our way out
from the centre of the detector, we can see how the different elements 
contribute to satisfying these requirements: 


• Tracker: This consists of very high-precision silicon detectors, immersed
in a large magnetic field from a superconducting magnet.
The trajectories of charged particles can be reconstructed, and
hence the momenta can be computed. These detectors are used in
conjunction with the calorimeters and muon detector to identify
and measure the momenta of electrons, muons, and taus. They
can also measure the momenta of charged hadrons. The very high
precision of these detectors allows good momentum resolution for
high-momentum particles. It also allows b and c quarks to be identified,
using the fact that because of their relatively long lifetime, the
trajectories of their decay products do not point back to the primary
vertex. The magnetic fields required, in the range 2–4 T, are
created by superconducting magnets, which are based on similar
technology to that used for the accelerator (see Chapter 3).


• Calorimeter: The first aim of the calorimeters is to provide high-precision
measurements of photons and electrons. The second aim
is to measure the energy of hadrons and so reconstruct hadronic 
jets. All particles apart from muons and weakly interacting par- 
ticles like neutrinos will deposit nearly all their energy in the 
calorimeters. In general, the energies are reconstructed from active 
detector elements interleaved with passive absorber material. For 
practical reasons that we will consider in Section 4.7, the calorim- 
eters are divided into electromagnetic (EM) and hadronic sections. 
Each type of calorimeter is further divided into small cells, which 
enables reconstruction of the transverse and longitudinal profiles 
of the energy deposition. This provides very powerful separation 
between electrons, which deposit nearly all their energy in a small 
region of the EM calorimeter, and hadronic jets, which produce 
deeper and wider showers. To reconstruct the missing transverse 
momentum, it is essential that the calorimeter cover a solid angle
as near to $4\pi$ as possible.3


Fig. 4.1 Longitudinal and transverse 
views of a generic collider detector. 


3 Two holes around the beam pipes are unavoidable, so there can be significant energy ‘lost’ down the pipes, but as the angles are very small, the transverse momenta are low; hence we can measure missing transverse momentum but not missing longitudinal momentum.

4 This is of course a very simplified picture: in reality, all these processes affect all detector types to some extent.

5 Here we are typically interested in particles with energies $E \gg 1$ MeV.


6 Charged particles can also lose energy 
by interacting with the atomic nuclei, 
but the energy transferred by the elas- 
tic scattering considered in this section 
is negligible compared with interactions 
with electrons because of the larger 
mass of the nuclei compared with elec- 
trons. Electrons also lose energy by 
bremsstrahlung, and this will be dis- 
cussed in Section 4.3.3. 


• Muon spectrometer: If the calorimeter is sufficiently thick, the
majority of particles emerging from the calorimeters will be muons 
(ignoring neutrinos), because they do not tend to produce elec- 
tromagnetic showers like electrons or have hadronic interactions. 
The trajectories of the muons are measured in large wire cham- 
bers and can be matched to high-transverse-momentum charged 
particles measured in the tracker, reducing the effects of hadrons 
‘leaking’ out of the back of the calorimeter. If there is a magnetic 
field in the region of the muon chambers, the muon trajectory can 
be used to determine the muon momenta. Possible magnetic field 
configurations are considered in Section 4.9. The momenta of the 
muons can be measured independently in the tracker and com- 
bined with the measurement in the muon spectrometer to get the 
best precision. 
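
The tracker and muon-spectrometer items above both rely on turning a measured curvature into a momentum. A minimal sketch of that step is given below, using the standard relation $p_T\,[\text{GeV}] \approx 0.3\,B\,[\text{T}]\,R\,[\text{m}]$ for a singly charged particle in a uniform field. This relation is not derived in the text above; the function names, the 2 T field and the 1 m lever arm are illustrative assumptions.

```python
# GeV per (tesla * metre) for a unit-charge particle: p_T = 0.2998 * B * R
C_FACTOR = 0.299792458

def pt_from_radius(b_tesla, radius_m, charge=1.0):
    """Transverse momentum in GeV from the radius of curvature."""
    return C_FACTOR * abs(charge) * b_tesla * radius_m

def radius_from_pt(pt_gev, b_tesla, charge=1.0):
    """Radius of curvature in metres for a given transverse momentum."""
    return pt_gev / (C_FACTOR * abs(charge) * b_tesla)

def sagitta(radius_m, chord_m):
    """Sagitta of the arc over a chord of given length (valid for chord << radius)."""
    return chord_m ** 2 / (8.0 * radius_m)

# Hypothetical example: 100 GeV track in a 2 T field measured over 1 m
radius = radius_from_pt(100.0, 2.0)
s = sagitta(radius, 1.0)
print(f"R = {radius:.1f} m, sagitta = {s * 1e3:.2f} mm")
```

The sagitta comes out below a millimetre in this example, which is one way to see why very precise position measurements are needed for high-momentum tracks.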


4.3 Particle interactions with matter 


This section covers the most important interactions of high-energy 
particles with matter that are needed to understand detector phys- 
ics. For tracking detectors, the most important process is ionization, 
since this generates the electron-ion pairs that we can detect. Multiple 
scattering is also important in tracking detectors because it limits the 
resolution. Electromagnetic processes such as pair production are fun- 
damental for understanding electromagnetic and hadronic calorimeters. 
Finally, hadronic interactions are obviously of particular importance for 
understanding hadronic calorimeters.4


4.3.1 Ionization 


All charged particles interact with electrons in the atoms in any material
in the detector. For high-energy particles,5 the energy transferred
to the electrons can be larger than the ionization energy, thus creating 
free electrons and positive ions. How to detect such secondary charged 
particles is discussed in this chapter. These collisions result in the in- 
cident particle losing energy in the lab frame (they are approximately 
elastic collisions in the CMS). We can understand the main features 
of ionization energy loss by starting from the formula for Rutherford 
scattering. The differential cross section (see Exercise 4.1) as a function 
of the 4-momentum transfer $Q^2$ and speed of the incoming particle $\beta$ is
given by

$$\frac{d\sigma}{dQ^2} = \frac{4\pi z^2 \alpha^2}{\beta^2 Q^4} \tag{4.1}$$

where $z$ is the charge in units of electron charge of the target particle
by which the charged particle is scattered and $\alpha$ is the fine-structure
constant. We can evaluate $Q^2$ in the rest frame of the electron before
the collision to be (see Exercise 4.2) $Q^2 = 2m_e T$, where $T$ is the kinetic
energy of the scattered electron. Then, changing variables in eqn 4.1,

$$\frac{d\sigma}{dT} = \frac{2\pi z^2 \alpha^2}{m_e \beta^2 T^2} \tag{4.2}$$


We then convert this expression for the energy loss in one collision to 
the average energy loss as a charged particle interacts with many atoms 
in some medium. The rate of energy loss per unit length in a medium 
with N atoms per unit volume and atomic number Z is 


$$\frac{dE}{dx} = NZ \int_{T_{\min}}^{T_{\max}} T\,\frac{d\sigma}{dT}\,dT \tag{4.3}$$

The minimum energy $T_{\min}$ is related to the ionization energy $I$. We can
calculate the maximum kinetic energy of the electron in the lab frame
by considering a collision in the rest frame in which the direction of
motion of the electron is reversed (see Exercise 4.2), which gives
$T_{\max} = 2\beta^2\gamma^2 m_e$. Substitution into eqn 4.3 gives an approximate formula for
the rate of energy loss of charged particles:

$$\frac{dE}{dx} = \frac{2\pi N Z z^2 \alpha^2}{m_e \beta^2} \ln\left(\frac{2\beta^2\gamma^2 m_e}{I}\right) \tag{4.4}$$

This shows that the energy loss initially decreases with increasing energy
and then rises logarithmically with energy. Finally, this formula modified
by relativistic effects is known as the Bethe formula7

$$\frac{dE}{dx} = K z^2 \frac{Z}{A} \frac{1}{\beta^2} \left[ \ln\left(\frac{2 m_e \beta^2 \gamma^2}{I}\right) - \beta^2 - \frac{\delta(\beta\gamma)}{2} \right] \tag{4.5}$$

where $K = 4\pi N_A r_e^2 m_e$ ($N_A$ is Avogadro’s number and $r_e$ is the classical
radius of the electron) and $Z$ and $A$ are the atomic number and atomic
mass number of the nucleus. It is conventional to express the stopping
power in units of MeV g$^{-1}$ cm$^2$. To transform this to the stopping power
per unit length, we simply multiply by the density $\rho$. At relativistic
energies, the electric field from the primary charged particle flattens
and so allows collisions with more distant atoms. However, at very high
energy, this effect is reduced by the polarization of the medium, which
leads to the ‘density effect’ correction term $\delta(\beta\gamma)$.
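
To get a feel for the numbers in eqn 4.5, the short Python sketch below evaluates the formula with the density-effect term set to zero and scans for the minimum. The silicon parameters ($Z = 14$, $A = 28.0855$, mean excitation energy $I \approx 173$ eV) and the numerical value of $K$ are assumptions supplied for this example rather than values quoted in the text, and the function name is hypothetical.

```python
import math

K = 0.307075    # MeV mol^-1 cm^2, numerical value of 4*pi*N_A*r_e^2*m_e (assumed)
M_E = 0.510999  # electron mass in MeV

def bethe_mass_stopping_power(beta_gamma, Z, A, I_eV, z=1.0):
    """Mean energy loss in MeV g^-1 cm^2 from eqn 4.5 with delta = 0.

    Neglecting the density effect overestimates the loss at large beta*gamma.
    """
    bg2 = beta_gamma ** 2
    beta2 = bg2 / (1.0 + bg2)          # beta^2 = (beta*gamma)^2 / (1 + (beta*gamma)^2)
    I_mev = I_eV * 1e-6
    log_term = math.log(2.0 * M_E * bg2 / I_mev)
    return K * z ** 2 * (Z / A) / beta2 * (log_term - beta2)

# Scan beta*gamma from 0.5 to about 20 for silicon (assumed parameters)
grid = [0.5 + 0.01 * i for i in range(2000)]
values = [bethe_mass_stopping_power(bg, 14, 28.0855, 173.0) for bg in grid]
min_value = min(values)
min_bg = grid[values.index(min_value)]
print(f"minimum of {min_value:.2f} MeV g^-1 cm^2 at beta*gamma = {min_bg:.2f}")
```

Under these assumptions the minimum is roughly 1.7 MeV g$^{-1}$ cm$^2$ at $\beta\gamma$ of about 3. Multiplying by the density converts this to an energy loss per unit length.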

The mean energy loss for charged particles in different media as a function
of $\beta\gamma$ is shown in Fig. 4.2. The important features of the stopping
power are very similar for all targets: at low momentum, the stopping
power decreases rapidly as the momentum of the incident particle increases,
and then rises logarithmically at higher momentum. There is
a broad minimum around $\beta\gamma \approx 3$ and the value of the minimum is
typically in the range 1–3 MeV g$^{-1}$ cm$^2$. Note that the energy loss by
ionization scales with the $Z$ of the material, which is very different to
the $Z^2$ scaling that we find for pair-production and bremsstrahlung processes.
We have discussed the mean energy loss, but there can be very
large fluctuations because of the large range of energies that can be lost
in a single collision. The spread in the actual energy lost is given by the
very broad ‘Landau’ distribution. Very large ‘tails’ in the distribution
are caused by the emission of single relatively energetic electrons (called
‘$\delta$-rays’).

7 This formula used to be called the Bethe–Bloch formula, but it is now usually called the Bethe formula.

Fig. 4.2 The mean energy loss by ionization in different materials as a function of $\beta\gamma$ [115] (where $\beta$ and $\gamma$ are the usual relativistic factors). Note the units of MeV g$^{-1}$ cm$^2$. To convert to linear stopping power, multiply by the density.

8 Here the effect of scattering off the atomic nucleus dominates over that from atomic electrons because of the larger charge on the nucleus.

Fig. 4.3 A charged particle undergoing multiple scattering in a material is deflected by an angle $\theta$.


4.3.2 Multiple scattering 


When a charged particle traverses a slab of a material, as sketched in 
Fig. 4.3, it undergoes a very large number of very small-angle Coulomb 
scatters from the nuclei of the material. The net result of that is that 
the particle emerges from the slab at an angle $\theta$ with respect to the
initial direction. Considering many identical particles, one gets a distribution
of their angles $\theta$ (in a plane like the plane of Fig. 4.3, or for
any plane containing the initial direction vector) that follows a Gaussian
distribution with standard deviation

$$\theta_0 = \frac{13.6\ \text{MeV}}{\beta p}\, q \sqrt{\frac{x}{X_0}} \left[ 1 + 0.038 \ln\left(\frac{x}{X_0}\right) \right] \tag{4.6}$$

where $X_0$ is the radiation length (see Section 4.3.3), $x$ is the thickness
of the material slab, and $p$, $\beta$, and $q$ are respectively the momentum,
velocity, and charge (in units of the electron charge) of the particle. The root mean square displacement
of the particle trajectory $y$ is then $y_{\text{plane}} = (x/\sqrt{3})\,\theta_0$. The effects of
multiple scattering degrade the resolution of track reconstruction and
therefore can have profound effects on the detector performance, as will
be discussed later in this chapter. Note that the amount of multiple
scattering scales with the amount of material expressed in radiation
lengths. This provides one motivation for the design of tracking detectors
that are ‘thin’ in units of radiation length. As the radiation length scales
with $Z^{-2}$, this shows that we should minimize the use of high-Z material.
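
A minimal numerical sketch of eqn 4.6 is given below for a charged pion crossing a single silicon layer. The layer thickness (300 μm), the silicon radiation length (9.37 cm) and the pion mass are assumed values chosen for illustration, and the function name is hypothetical.

```python
import math

def theta0(p_mev, beta, x_over_X0, charge=1.0):
    """RMS plane scattering angle in radians from eqn 4.6.

    p_mev is the momentum in MeV, x_over_X0 the thickness in radiation lengths,
    charge the particle charge in units of the electron charge.
    """
    return (13.6 / (beta * p_mev)) * abs(charge) * math.sqrt(x_over_X0) * (
        1.0 + 0.038 * math.log(x_over_X0)
    )

X0_SILICON_CM = 9.37    # assumed radiation length of silicon
THICKNESS_CM = 0.03     # assumed 300 micron sensor
M_PION_MEV = 139.57     # charged pion mass

p = 1000.0                                  # 1 GeV momentum
beta = p / math.hypot(p, M_PION_MEV)        # beta = p / E
t0 = theta0(p, beta, THICKNESS_CM / X0_SILICON_CM)
print(f"theta_0 = {t0 * 1e3:.2f} mrad")
```

With these inputs the deflection is roughly 0.6 mrad per layer. For relativistic particles it falls as $1/p$, which is why multiple scattering limits the resolution mainly for low-momentum tracks.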


4.3.3 Electromagnetic interactions 


Electrons and positrons lose energy by ionization in a similar way to
that discussed in Section 4.3.1 (although there are some differences, associated
with issues like the spin and identical particles for the case of
electrons). However, at high energy, the dominant process for energy loss
is bremsstrahlung (see Fig. 4.4). For an electron of energy $E$, the rate
of change of energy due to bremsstrahlung as a function of distance $x$ is
given by

$$\frac{dE}{dx} = -\frac{E}{X_0} \tag{4.7}$$

where $X_0$ is the radiation length for the material. We can easily integrate
eqn 4.7 to show that in travelling a distance $X_0$, the electron energy
falls to a fraction $1/e$ of its initial value. An approximate formula for the radiation
length is given by (see [115] for the full expression)

$$\frac{1}{X_0} = \frac{4\alpha^3 N_A}{m_e^2 A}\, Z^2 L_{\text{rad}} \tag{4.8}$$

where $N_A$ is Avogadro’s number, $Z$ is the atomic number, $A$ is the atomic
mass number (number of protons and neutrons), and, for $Z > 4$,
$L_{\text{rad}} = \ln(184.15\,Z^{-1/3})$. We can see that the radiation length scales as $1/\alpha^3$, as
expected because the Feynman diagram contains three vertices. The
electron ‘sees’ the charge of the entire nucleus at one vertex, so the cross
section scales with the atomic number as $Z^2$.9 The differential cross
section for bremsstrahlung as a function of the variable $y = k/E$, where
$k$ and $E$ are respectively the photon and electron energies, is given to a
good approximation by

$$\frac{d\sigma}{dk} = \frac{A}{X_0 N_A k} \left( \frac{4}{3} - \frac{4}{3} y + y^2 \right) \tag{4.9}$$

Fig. 4.4 Lowest-order Feynman diagram for bremsstrahlung ($eZ \to eZ\gamma$) in a material with nuclear charge $Ze$.

9 This scaling with $Z$ is more rapid than the linear scaling with $Z$ that we found for energy loss by ionization.


The characteristic feature of eqn 4.9 is that the photon energy spectrum 
is peaked at low values. In one radiation length, it is very unlikely that 
the electron will lose energy to only one high-energy photon—it is far 
more common for it to lose energy to many lower-energy photons. 
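
The statement that the electron energy drops by $1/e$ over one radiation length follows from a one-line integration of eqn 4.7. In the sketch below, $E_0$ denotes the electron energy at $x = 0$ (a symbol introduced here for illustration), and bremsstrahlung is treated as the only energy-loss mechanism:

```latex
\frac{dE}{E} = -\frac{dx}{X_0}
\quad\Longrightarrow\quad
E(x) = E_0\, e^{-x/X_0},
\qquad
E(X_0) = \frac{E_0}{e} \approx 0.37\,E_0 .
```

This describes the mean energy only. As noted above, the loss proceeds through many photon emissions of varying energy, so individual electrons fluctuate about this exponential.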
High-energy photons can undergo pair conversion (see Fig. 4.5), which
is clearly a process closely related to bremsstrahlung. At high energies,
the differential cross section for pair production as a function of the
fraction of the photon energy given to the electron, $x$, is

$$\frac{d\sigma}{dx} = \frac{A}{X_0 N_A} \left[ 1 - \frac{4}{3} x(1 - x) \right] \tag{4.10}$$

We can integrate eqn 4.10 to obtain the total pair production cross
section10

$$\sigma = \frac{7}{9} \frac{A}{X_0 N_A} \tag{4.11}$$

Fig. 4.5 Lowest-order Feynman diagram for the pair production process for a photon interacting with a nucleus of charge $Ze$.

10 Note that the length for a primary electron to decrease in energy by a factor $f$ is 7/9 of the length for which the probability of a photon not to pair-produce is equal to the same factor $f$.

11 We can make an average correction to allow for shower leakage out of the back of the calorimeter, but there will always be shower-to-shower statistical fluctuations in the amount of leakage, which we cannot correct for. So, if we want a high-resolution electromagnetic calorimeter, we must ensure that it is deep enough for almost complete shower containment.

12 This sets the natural size for the transverse granularity for electromagnetic calorimeters. We wish to separate electrons (positrons) from hadrons using, among other measures, the transverse shower size. This improves with finer granularity, but we clearly do not gain by having cells with lateral dimensions much less than $R_M$.

At lower energies, the dominant process for energy loss by photons is
Compton scattering $\gamma e \to \gamma e$. Here the target electron is approximately
at rest in an atom and it is ejected from the atom in the process
(i.e., in the lab frame, energy is transferred from the incident photon to
the outgoing electron).

Now that we have considered the fundamental electromagnetic interactions
in matter, we can understand the nature of the resulting
electromagnetic showers. Incident high-energy electrons will lose energy
by bremsstrahlung and the resulting photons will create $e^+e^-$
pairs, which in turn will create more photons by bremsstrahlung. We
need to consider the competition between the rate of energy loss from
bremsstrahlung/pair production and ionization. The former increases
approximately linearly with energy, whereas loss due to ionization increases
only logarithmically. When the energy of the electrons decreases
to the ‘critical energy’ $E_c$, the energy loss by bremsstrahlung will be
equal to that by ionization. An approximate fit to the critical energy as
a function of atomic number $Z$ is given by

$$E_c = \frac{610\ \text{MeV}}{Z + 1.24} \tag{4.12}$$

As the electron (positron) energies become lower than $E_c$, they lose
energy rapidly, become non-relativistic and lose energy by ionization
even more rapidly, hence ending the shower development. This results in
the shower depth varying logarithmically with energy (see Exercise 4.4).
The longitudinal shower profile can be calculated rather accurately using
Monte Carlo simulations and an example is shown in Fig. 4.6. We require
nearly complete shower containment to obtain good energy resolution,
so for 30 GeV electrons we need a depth of at least $\sim 20X_0$.11

Electromagnetic showers broaden as they penetrate deeper into matter
owing to multiple Coulomb scattering of the electrons (positrons) and
the scattering angles involved in bremsstrahlung and pair production.
The first effect dominates and we can parameterize the width of the
shower by the ‘Molière radius’ $R_M = X_0 E_s/E_c$, with $E_s \approx 21$ MeV.
Approximately 90% of the energy is contained within a radius of $R_M$.12
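
The quantities introduced above can be combined in a few lines of Python. The sketch below evaluates eqn 4.12 and the Molière radius for lead, using $X_0$ and $\rho$ from Table 4.1 and $Z = 82$. The depth of the shower maximum is estimated with a toy model in which the number of particles doubles every radiation length until the energy per particle reaches $E_c$. That doubling model is an assumption made here to illustrate the logarithmic scaling and is not the Monte Carlo calculation referred to in the text.

```python
import math

E_S_MEV = 21.0  # scale energy used in the Moliere radius

def critical_energy_mev(Z):
    """Approximate critical energy from eqn 4.12."""
    return 610.0 / (Z + 1.24)

def moliere_radius(X0, Z):
    """Moliere radius R_M = X0 * E_s / E_c, in the same units as X0."""
    return X0 * E_S_MEV / critical_energy_mev(Z)

def shower_max_depth_X0(E0_mev, Z):
    """Toy estimate of the shower-maximum depth in units of X0.

    Assumes the particle number doubles every radiation length until the
    energy per particle falls to the critical energy.
    """
    return math.log(E0_mev / critical_energy_mev(Z)) / math.log(2.0)

# Lead: Z = 82, X0 = 6.37 g cm^-2, rho = 11.350 g cm^-3 (Table 4.1)
ec_pb = critical_energy_mev(82)
rm_pb_cm = moliere_radius(6.37, 82) / 11.350
tmax_30gev = shower_max_depth_X0(30e3, 82)
print(f"E_c = {ec_pb:.2f} MeV, R_M = {rm_pb_cm:.2f} cm, t_max = {tmax_30gev:.1f} X0")
```

In this toy model, doubling the incident energy moves the shower maximum deeper by exactly one radiation length. Only that logarithmic trend, not the absolute depth, should be taken from the sketch.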
4.3.4 Cerenkov radiation


When a charged particle moves at a speed v greater than the local phase 
velocity of light, 1/n, where n is the refractive index of the medium, it 
will emit Cerenkov photons. The angle of the Cerenkov photons relative 
to the charged particle can be calculated using a simple geometrical 
argument (see Fig. 4.7). In time t, the relativistic particle travels from A 
to B, a distance of vt. The electromagnetic wave emitted by the particle 
from A is travelling at the (lower) speed of 1/n. The wavefront, defined 
by the plane with a constant phase, is given by the line from C to B. 
Hence the Cerenkov angle is given by $\cos\theta_C = 1/(nv)$.13 The photons
are typically in the optical range and can be detected in a similar way 
to that used for scintillation light (see Section 4.4.2). 
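
The relation $\cos\theta_C = 1/(nv)$ also fixes a threshold, since no light is emitted unless $v > 1/n$. The sketch below computes the angle and the corresponding threshold momentum in natural units. The refractive index of 1.33 (roughly that of water) and the muon mass are assumed example inputs, and the function names are hypothetical.

```python
import math

def cherenkov_angle(beta, n):
    """Cerenkov angle in radians, or None below threshold (beta * n <= 1)."""
    if beta * n <= 1.0:
        return None
    return math.acos(1.0 / (n * beta))

def threshold_momentum(mass, n):
    """Momentum at which beta = 1/n, i.e. p = m * beta * gamma = m / sqrt(n^2 - 1)."""
    return mass / math.sqrt(n * n - 1.0)

N_WATER = 1.33          # assumed refractive index
M_MUON_MEV = 105.66     # muon mass

angle_deg = math.degrees(cherenkov_angle(1.0, N_WATER))
p_thr = threshold_momentum(M_MUON_MEV, N_WATER)
print(f"theta_C = {angle_deg:.1f} deg, muon threshold = {p_thr:.1f} MeV")
```

Because the threshold depends on the particle mass at fixed momentum, the presence or absence of Cerenkov light can be used to distinguish particle species.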


4.3.5 Transition radiation 


If a high-energy charged particle crosses a boundary between two me- 
dia with different dielectric constants, it can emit transition-radiation 
photons. The yield depends on the Lorentz factor $\gamma$ and therefore allows
the separation of electrons from charged hadrons. The yield per interface
is $O(\alpha)$ and is therefore very low, implying that a practical transition-radiation
detector requires hundreds of interfaces, which can be achieved
for example with Mylar foils. 


4.3.6 Hadronic interactions 


High-energy hadrons undergo nuclear interactions in matter. The physics
involved cannot be calculated from first principles, so phenomenological
models are needed. It is useful to define the interaction length $\lambda_I$ as the
length in a material in which the probability of a hadron not interacting
is $1/e$. The cross section at high energy for scattering of a hadron on
a nucleus scales like $\sigma = R_0 A^{2/3}$, which is quite different to the $Z^2$
scaling for bremsstrahlung and pair production.14 The interaction length
is compared with the radiation length for a few common absorbers used
in calorimeters in Table 4.1. The longitudinal shower profile for high-energy
pions in iron [88] is shown in Fig. 4.8. We see that for good
shower containment, we need a depth of about $10\lambda_I$, which results in
very large calorimeters. This obviously increases the cost of the hadronic
calorimeter itself, but also increases the radius for the start of the muon
detectors, and thus increases the area and cost of the muon spectrometer.
Therefore, for cost reasons, there will usually be some significant energy
leakage out of the back of a hadronic calorimeter. As for electromagnetic
calorimeters, we can make an average correction for this effect, but the
statistical fluctuations will degrade the resolution.

Element    $X_0$ (g cm$^{-2}$)    $\lambda_I$ (g cm$^{-2}$)    $\rho$ (g cm$^{-3}$)
Iron       13.84                  132.1                        7.874
Copper     12.86                  137.3                        8.960
Lead        6.37                  199.6                        11.350
Uranium     6.00                  209.0                        18.950

Table 4.1 Radiation length $X_0$, interaction length $\lambda_I$, and density $\rho$ for some elements.

Fig. 4.6 Simulation of the longitudinal shower profile for incident 30 GeV electrons and photons on iron [115]. The histogram shows the energy deposition and the circles (squares) indicate the number of electrons (photons). The photons penetrate more deeply than electrons, reflecting the factor of 7/9 in eqn 4.11.

Fig. 4.7 Geometrical construction for calculation of the Cerenkov angle.

Fig. 4.8 Measured shower profiles for high-energy pions in the CDHS detector [88].

13 We are using natural units with $c = 1$ and we have assumed that the medium is non-dispersive.

14 This provides another motivation for using high-Z absorbers in an electromagnetic calorimeter (apart from wishing to limit the depth required for good shower containment): electromagnetic showers are contained in a shorter depth than hadronic showers and the separation is better for higher-Z absorbers.
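
To make the size argument concrete, the following Python sketch converts the Table 4.1 entries from mass thickness to centimetres and compares the two length scales. The dictionary layout and function name are choices made for this example.

```python
# (X0 in g cm^-2, lambda_I in g cm^-2, density in g cm^-3), values from Table 4.1
ABSORBERS = {
    "iron": (13.84, 132.1, 7.874),
    "copper": (12.86, 137.3, 8.960),
    "lead": (6.37, 199.6, 11.350),
    "uranium": (6.00, 209.0, 18.950),
}

def lengths_cm(name):
    """Return (radiation length, interaction length) in cm for an absorber."""
    x0, lam, rho = ABSORBERS[name]
    return x0 / rho, lam / rho

for name in ABSORBERS:
    x0_cm, lam_cm = lengths_cm(name)
    print(f"{name:8s} X0 = {x0_cm:5.2f} cm  lambda_I = {lam_cm:5.2f} cm  "
          f"ratio = {lam_cm / x0_cm:5.1f}  10 lambda_I = {10 * lam_cm / 100:.2f} m")

iron_x0_cm, iron_lam_cm = lengths_cm("iron")
lead_x0_cm, lead_lam_cm = lengths_cm("lead")
```

For iron, ten interaction lengths correspond to roughly 1.7 m of absorber. The ratio of interaction length to radiation length is about three times larger in lead than in iron, which is the separation advantage of high-Z absorbers mentioned in footnote 14.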

A high-energy hadron interacting with a nucleus will create a mixture 
of charged and neutral hadrons. The $\pi^0$s will decay rapidly to photons
and thus induce electromagnetic showers. The charged hadrons produced
will penetrate further into the calorimeter and create secondary hadronic
interactions, leading to the development of hadronic showers deep into
the calorimeter. The big difficulty with hadronic calorimetry is that a
significant fraction of the energy goes into nuclear breakup and evaporating
neutrons and protons from the nuclei. The resulting low-energy
nuclei and protons will be very heavily ionizing and lose energy rapidly.
Typically, this will occur in the passive absorber,15 producing no detectable
signal in the ‘active’ layers. The low-energy neutrons will scatter
and thermalize on a timescale of microseconds, and so any photons pro- 
duced from neutron capture will be outside the time ‘window’ for signal 
collection. The fraction of energy that is effectively ‘lost’ in a hadronic 
interaction due to these processes is very large (typically in the range 
20-40%). The real problem is that there is a very large variation in 
this lost fraction from shower to shower, which greatly degrades the 
resolution of hadronic compared with electromagnetic calorimeters. The 
magnitude of the effect can be parameterized by the ratio of the response 
to electrons to that to hadrons, e/h. If e/h is significantly different from 
unity, the calorimeter resolution will be limited and there will be large 
non-Gaussian fluctuations. Several ideas have been pursued to try to 
achieve ‘compensating’ calorimeters in which $e/h \approx 1$ and these will be
discussed in Section 4.7.5. 


4.4 Signal generation 


In Section 4.3, we have considered how particles lose energy in matter 
and create showers of secondary charged and neutral particles. We now 
need to examine how we can actually detect these secondary particles as 
well as the particles from the primary collision. In Section 4.4.1, we will 
see how charged particles moving between electrodes induce currents, 
which we can amplify and read out with suitable electronics. Another 
approach, considered in Section 4.4.2, is to use scintillation light. The 
scintillation and Cerenkov processes result in photons in the visible or 
ultraviolet wavelengths, so in Section 4.5 we review techniques to detect 
these photons. 


4.4.1 Moving charges 


In this section, we explain how to calculate the induced currents cre- 
ated by moving charges, which generate the electrical signals we can 
measure in detectors like wire chambers or silicon detectors. We first cal- 
culate the induced current for a simple case and then discuss the general 
solution. 

Consider a charged particle held between the two (infinite) plates of
a parallel-plate capacitor, with both plates grounded. The potential is
given by the solution of Laplace’s equation [76], subject to appropriate
boundary conditions (the potential is 0 on the plates and approximates
that from a point charge in the vicinity of the charge):

$$V(\rho, z) = \frac{q}{\pi\epsilon_0 L} \sum_{n=1}^{\infty} \sin\left(\frac{n\pi z_0}{L}\right) \sin\left(\frac{n\pi z}{L}\right) K_0\left(\frac{n\pi\rho}{L}\right) \tag{4.13}$$

where $z_0$ is the distance from the lower plate to the point charge, $L$ is
the separation between the plates, $\rho = \sqrt{x^2 + y^2}$ (where $x$ and $y$ are
the Cartesian coordinates of the point charge in the plane of the lower
plate; see Fig. 4.9), and $K_0$ is a modified Bessel function. The solutions
for three locations of the charge are illustrated by the equipotentials
shown in Fig. 4.9.

15 For cost reasons, hadronic calorimeters are divided into alternating layers of ‘passive’ absorber and ‘active’ layers that detect the signal.

The induced electric surface charge density on the conducting plate at
$z = 0$ is given by $\sigma = \epsilon_0 |E_z(z = 0)|$, where $\mathbf{E} = -\nabla V$ is the electric field
evaluated at the edge of the conductor ($z$ is the direction perpendicular
to the conductor). When the charge is near the upper (lower) plate,
we see that the equipotentials are more tightly packed near the upper
(lower) plate. Therefore, when the charge is near the upper (lower) plate,
the $\mathbf{E}$ field will be larger nearer the upper (lower) plate and hence there
will be a larger induced charge on the upper (lower) plate. The fields and
induced charges are obviously symmetric when the charge is equidistant
from the two plates. Now let us imagine moving the charge from near the
upper plate to near the lower plate. Initially, most of the induced charge
will be on the upper plate, but this will gradually change and at the end
most of the induced charge will be on the lower plate. This then looks like
a current flowing between the two conductor plates. This is a qualitative
example of the fundamental result in detector physics: moving charges
between conducting electrodes induce currents.16 This current can be
amplified and digitized by appropriate readout electronics.

Fig. 4.9 Equipotentials (arbitrary units) for a point charge at three different locations in a parallel-plate capacitor: (a) near the upper plate; (b) near the centre; (c) near the lower plate. The equipotentials near the point charge are omitted for clarity.

Now that we have seen a qualitative description of the physics of 
induced charges, we can look at the quantitative solution. Taking the 
derivative in the z direction of the potential (eqn 4.13), we can determine 
the induced surface charge density on the upper and lower plates using 
Gauss’s law: 


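$$\sigma(\rho, z = 0) = -\frac{q}{L^2} \sum_{n=1}^{\infty} n \sin\left(\frac{n\pi z_0}{L}\right) K_0\left(\frac{n\pi\rho}{L}\right)$$

$$\sigma(\rho, z = L) = \frac{q}{L^2} \sum_{n=1}^{\infty} (-1)^n\, n \sin\left(\frac{n\pi z_0}{L}\right) K_0\left(\frac{n\pi\rho}{L}\right) \tag{4.14}$$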


We can integrate the surface charge density to find the total charge 
induced on the upper plate as 


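$$Q_U = 2\pi \int_0^{\infty} \sigma(\rho, z = L)\,\rho\,d\rho = \frac{2q}{\pi} \sum_{n=1}^{\infty} \frac{(-1)^n}{n} \sin\left(\frac{n\pi z_0}{L}\right) \int_0^{\infty} u K_0(u)\,du \tag{4.15}$$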


The integral is equal to unity, so 


$$Q_U = \frac{2q}{\pi} \sum_{n=1}^{\infty} \frac{(-1)^n}{n} \sin\left(\frac{n\pi z_0}{L}\right) \tag{4.16}$$


The infinite sum is related to a Fourier series (see Exercise 4.3), so 


$$Q_U = -\frac{q z_0}{L} \tag{4.17}$$

We can calculate the surface charge on the lower plate at $z = 0$ by the
same method, to obtain17

$$Q_L = -\frac{q(L - z_0)}{L} \tag{4.18}$$

Now let the charge between the capacitor plates move with a speed v 
in the negative z direction. The induced charge flows from the upper to 
the lower plate and the current (while the charge is moving) is given by 
the rate of change of charge as 


$$I = -\frac{qv}{L} \tag{4.19}$$

We have determined the induced current for the simplest possible 
geometry. 

A more general solution to the calculation of the induced current, 

which is indispensable for understanding realistic detector geometries,


16 Note that the induced signal occurs
as long as the charge is moving and 
stops when the charges are collected on 
the electrodes. A popular misconcep- 
tion is that the signal only arises when 
the charge is ‘collected’ at an electrode. 


17 The total induced charge on the two 
plates is —q, as expected. 




18 See Spieler in the Further Reading at
the end of this chapter for a derivation 
of Ramo’s theorem. 


19Note that this field is not the same 
as the electric field and does not even 
have the same dimensions. 


20 Only in the case of two-electrode sys- 
tems does the weighting field have the 
same form as the physical electric field. 


21 This result is independent of the
forces causing the charge to move with 
a velocity v. In a particle detector, the 
motion is due to the applied electric 
and magnetic fields and the interactions 
of the moving charge with atoms or 
molecules in the detector. 


22 This is clearly a problem for an appli- 
cation requiring large-area scintillators. 


is provided by Ramo’s theorem.18 This will provide us with a simple 
method for calculating the induced currents from any movements of 
charges and is therefore of fundamental importance in detector physics. 
First, we set the potential on the electrode being considered to 1 V and 
apply 0 V to all other electrodes and calculate the potential Φ by solving
Laplace’s equation subject to these boundary conditions. The ‘weighting’
field19 is defined as

$$\mathbf{E}_w = -\nabla\Phi \tag{4.20}$$

The current induced on this electrode, caused by the motion of n
charges q_j, moving with velocities v_j (j = 1,...,n), is given by

$$i = -\sum_{j=1}^{n} q_j\,\mathbf{v}_j\cdot\mathbf{E}_w \tag{4.21}$$


The velocity depends on the real electric field, not the weighting field.20
As a simple ‘sanity check’, we can now use Ramo’s theorem to calculate 
the induced current for the case of a point charge between the plates 
of an infinite parallel-plate capacitor and compare it with the result 
obtained above. For this geometry, if we apply 1 V on one electrode and
0 V on the other electrode, then the weighting field is uniform and has a
magnitude of 1/L. For a point charge q moving with velocity v parallel 
to this weighting field, we obtain the induced current from eqn 4.21 as 


$$I = -\frac{qv}{L} \tag{4.22}$$

which is in agreement with eqn 4.19.21
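
The sanity check above can also be written as a few lines of code. This is a minimal sketch of eqn 4.21 for a uniform weighting field only; the function name and the numerical values of the plate separation and drift speed are assumptions chosen for illustration.

```python
def ramo_current(charges, velocities, weighting_field):
    """Eqn 4.21: i = -sum_j q_j (v_j . E_w), with vectors given as (x, y, z) tuples.

    The weighting field is taken to be uniform, which holds only for the
    infinite parallel-plate geometry discussed in the text.
    """
    total = 0.0
    for q, v in zip(charges, velocities):
        total += q * sum(vi * ei for vi, ei in zip(v, weighting_field))
    return -total

L = 1.0e-3                   # plate separation in m (assumed value)
q = -1.602e-19               # charge of one electron in C
v = 5.0e4                    # drift speed in m/s (assumed value)
E_w = (0.0, 0.0, 1.0 / L)    # uniform weighting field of magnitude 1/L

i = ramo_current([q], [(0.0, 0.0, v)], E_w)   # velocity parallel to E_w
print(i, -q * v / L)                           # the two values should coincide
```

For several charges, the same function simply sums the individual contributions, which is what makes Ramo's theorem convenient for realistic signals.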


4.4.2 Scintillators 


Scintillators are materials in which ionizing particles can cause scintilla- 
tion light, which can be detected by photodetectors. There are two broad 
classes of scintillator: organic and inorganic. A common example of an or- 
ganic scintillator is polystyrene. In an organic scintillator, molecules are 
lifted into an excited state by an ionizing particle and de-excite by emit- 
ting scintillation photons (typically in the UV). The problem with this is 
that the reverse reaction has a large cross section, so these UV photons 
will be rapidly re-absorbed.22 This problem is solved by introducing a
dopant so that these photons are absorbed by a fluorescent molecule 
(a ‘fluor’). The fluor then decays rapidly to a lower-energy state via a 
radiative decay, emitting longer-wavelength photons. This increases the 
attenuation length, but it is usually still too short for practical appli- 
cations. Therefore, a secondary fluor is used to shift the photons into 
the visible wavelength range, and these photons can have a suitably 
long attenuation length. The typical scintillation and fluorescence pro- 
cesses [115] are illustrated in Fig. 4.10. This type of organic scintillator 
is often used in sampling calorimeters (see Section 4.7).

A classic example of an inorganic scintillator is thallium-doped sodium
iodide, NaI(Tl). A high-energy particle can excite an electron from the
valence to the conduction band. The electron can drop from the con- 
duction to the valence level with the emission of a photon. However, the 
reverse process will result in too short an attenuation length for a useful 
detector. Therefore, a different process is used in which high-energy par- 
ticles create excitons (loosely bound states of an electron and a hole). An 
exciton can move through the crystal until it is captured by an impurity 
state (created by the doping with Tl), which can then decay via emission
of a photon, thus creating scintillation light.23 NaI(Tl) has the advantage
of high density, which allows the construction of a more compact cal- 
orimeter, thus reducing the cost, and it also has a very good yield for 
scintillation light. This scintillator is still used in many applications and 
it was used in older particle physics detectors. The problem is that it 
is too slow for use at modern colliders, because the scintillation decay 
time is 250 ns, which is much longer than the time between collisions 
at the LHC of 25ns. To use an inorganic scintillator at the LHC, we 
need a very fast decay time. Also, the scintillator must be very tolerant 
to radiation—most scintillators would become opaque after exposure to 
LHC radiation levels. Such a scintillator, PbWO4, has been developed
for the CMS electromagnetic calorimeter; its use there will be described 
in Section 4.7.2. 
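
To see why a 250 ns decay time is a problem at 25 ns bunch spacing, the following sketch estimates how much of the light arrives before the next collision. It assumes a single exponential decay, which is a simplification; the function name is hypothetical.

```python
import math

def fraction_emitted(t_ns, tau_ns):
    """Fraction of the scintillation light emitted within t_ns,
    assuming a single exponential decay with time constant tau_ns."""
    return 1.0 - math.exp(-t_ns / tau_ns)

tau_nai = 250.0        # NaI(Tl) decay time quoted in the text, in ns
bunch_spacing = 25.0   # time between LHC collisions, in ns

# Light emitted before the next bunch crossing, and after ten crossings.
print(fraction_emitted(bunch_spacing, tau_nai))
print(fraction_emitted(10 * bunch_spacing, tau_nai))
```

Under this assumption, less than 10% of the light is emitted within one bunch spacing, so the signals from many successive collisions would overlap.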


4.5 Photon detection 


We have seen that scintillation and Čerenkov radiation result in photons
in the range from the optical to the UV, which we have to convert into 
an electrical signal that can be digitized and read out. The traditional 
method is based on photomultipliers (PMTs), but another technique 
that is becoming increasingly common uses avalanche photodiodes. 
A schematic illustration of a photomultiplier coupled to a scintillator 
is shown in Fig. 4.11. A photomultiplier has a photocathode, usually

Fig. 4.10 Scintillation and fluorescence steps in an organic scintillator [115]. Typical values are given for the wavelength and absorption length of the photons. The first step in the process, ‘Foerster energy transfer’, does not involve photon emission but is a dipole-dipole interaction between the base and the primary fluor.


23 As the doping concentration is rela- 
tively low, the probability of the scin- 
tillation light being reabsorbed in the 
crystal is very low; i.e. the crystal is 
transparent at this wavelength.

Fig. 4.11 Schematic view of a photomultiplier and the main processes involved. The primary photoelectron is emitted from the photocathode and is accelerated and focused until it hits the first dynode. It then liberates many secondary electrons, which are accelerated to the next dynode. The resulting induced current is detected on the anode. From https://commons.wikimedia.org/wiki/File:PhotoMultiplierTubeAndScintillator.jpg


24 This is called a photoelectron.


25 Additional electrodes act as electro- 
static focusing elements to increase the 
fraction of electrons collected at the 
first dynode. 


26 For operation in moderate magnetic 
fields, PMTs can be shielded by shields 
made of ‘mu-metal’, an alloy with an 
exceptionally large relative permeabil- 
ity. However, saturation effects prevent 
this technique from working in high 
magnetic fields. 


27 Silicon is one possible semiconductor 
that can be used for photodiodes, but 
there are photodiodes made from other 
semiconductors such as GaAs or In- 
GaAs. The optimal choice for any ap- 
plication depends on several factors, in- 
cluding wavelength, speed of response, 
and cost.

containing two alkali elements to obtain the best quantum efficiency.
When a photon with an energy greater than the work function hits the
photocathode, it can eject an electron by the photoelectric effect.24 The
resulting electron is then accelerated by an applied electric field until
it hits the first dynode.25 This causes the emission of several secondary
electrons (the electron has been accelerated so it has sufficient energy to
do this). The secondary electrons are similarly accelerated and strike the
second dynode. This clearly multiplies the number of electrons (hence
the name photomultiplier). Several stages of dynodes are used and it is
easy to obtain a very large gain (~10^6 or more). A single photon thus
creates a large pulse of electrons that is easy to detect and digitize.
One disadvantage of PMTs is that they do not work in large magnetic
fields.26
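
The multiplication along the dynode chain can be illustrated with a one-line estimate. The sketch assumes that every dynode has the same secondary-emission yield and that all electrons are collected; the yield of 4 and the 10 stages are illustrative assumptions, not values from the text.

```python
def pmt_gain(secondary_yield, n_dynodes):
    """Idealized PMT gain: each dynode multiplies the electron count by the same yield."""
    return secondary_yield ** n_dynodes

# Assumed example: 4 secondary electrons per incident electron, 10 dynodes.
print(pmt_gain(4, 10))   # 1048576, i.e. a gain of about 10^6
```

Because the gain is a power of the yield, it is very sensitive to the voltage applied across each stage.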

The simplest solid state photodetector is a photodiode. In a photo- 
diode, photons create electron-hole pairs in a detector working on the 
same principles as that of a silicon detector (see Section 4.6.2).27 The
problem is that the small signal results in a low signal-to-noise ratio. In 
an avalanche photodiode (APD), the electric field is large enough that 
electrons acquire sufficient energy to create further electron-hole pairs, 
leading to an ‘avalanche’ effect. This avalanche process creates an intrin- 
sic gain in the device that results in APDs having better resolution for 
small calorimeter signals than simple photodiodes. This requires that a 
larger reverse bias be applied, typically ~100 V, which results in an
avalanche gain in the range 10-100. The gain of an APD is more sensitive to
the applied bias voltage and the temperature than that of a simple pho- 
todiode. In addition, the design needs to ensure that the avalanche does 
not lead to electrical breakdown. One key advantage of APDs for par- 
ticle physics applications is that they are insensitive to applied magnetic 
fields. 


4.6 Detectors for charged-particle tracks 


The traditional technology for tracking used wire chambers. These have 
been largely replaced by silicon detectors for the inner ‘trackers’ in LHC 
general purpose detectors. However, the cost of silicon detectors would
be prohibitive for the very large-area outer detectors needed for the
muon spectrometers, so wire chambers are the only practical technology 
for these systems. 


4.6.1 Wire chambers 


A primary high-energy charged particle passing through a gas will create 
a few electron-ion pairs by ionization. To create a sufficiently large signal 
(i.e. greater than the electronic noise of an amplifier), we need to use 
an avalanche process. We start by considering the ‘gas gain’ caused by 
an avalanche, then consider the simple proportional wire chamber, and 
finally look at a ‘drift’ chamber. 


Gas gain 


At sufficiently high electric fields (~100 kV cm^-1), electrons drifting in
an electric field acquire sufficient energy to cause further ionization in 
the gas and thus enable an avalanche process that can result in a very 
large increase in the number of electron-ion pairs. We define the gas
gain G = N/N_0, where N_0 and N are the initial and final numbers of
electron-ion pairs. The change in N with a distance travelled ds is

$$dN = N\alpha\,ds \tag{4.23}$$

where α is called the first Townsend coefficient and has to be measured
experimentally. We can integrate eqn 4.23 for the gas gain:

$$G = \frac{N}{N_0} = \exp\left(\int \alpha\,ds\right) = \exp\left(\int_{E_{\min}}^{E_{\max}} \frac{\alpha(E)}{dE/ds}\,dE\right) \tag{4.24}$$

where E is the electric field, E_min is the value of E at the start of the
avalanche, and E_max is the value at the end of the avalanche (e.g. at
the wire in a wire chamber). The value of E_min is simply related to the
mean free path for electrons λ and the average ionization energy I by
conservation of energy: eE_min λ = I.

We can now summarize the general features of the gas gain as a func- 
tion of the applied voltage across a chamber as illustrated in Fig. 4.12. 
At very low voltages, the electrons recombine with ions before they are 
collected. At higher voltages, we can distinguish different regions: 


• Ionization chamber: In this region, the electrons do not acquire
sufficient energy to start an avalanche and we just see the signal
from the primary electron-ion pairs, i.e. there is no gas gain.

• Proportional regime: In the ‘proportional’ regime, there is gas
gain and the number of electron-ion pairs created by the avalanche
is proportional to the number of primary electron-ion pairs.
Typical values of gas gain are in the range 10^4-10^5.

• Limited proportionality: At higher voltages, the density of the
electron-ion pairs is so high that after the lighter electrons have
drifted some distance, the net space-charge density is so large that
it decreases the field, thus lowering the gain.

• Geiger-Müller mode: The highest voltages result in the Geiger-Müller
mode with very large gains ~10^10. For operation in this
regime, to avoid complete electrical breakdown, we require a very
large series resistance for the high voltage to lower the latter when
the current gets too high. This creates a long recovery time between
pulses and is therefore not useful for high-rate applications.

Fig. 4.12 Variation in gas gain as a function of applied voltage [125].


The actual calculation of the gas gain depends on dE/ds, which clearly
depends on the geometry used to create the field. For example, we can
calculate the gas gain for the case of the proportional wire chamber to be
discussed in detail below. Substituting for the electric field from eqn 4.27
into the gas gain equation (eqn 4.24), we can show that the gas gain is

$$G = \exp\left(\frac{V}{\ln(b/a)}\int_{E_{\min}}^{E_{\max}} \frac{\alpha(E)}{E^2}\,dE\right) \tag{4.25}$$

If we use the linear approximation that α(E) ≈ βE, where β is an
empirical constant, then we can integrate eqn 4.25. Taking E_min = I/(λe)
and E_max = V/[ln(b/a) a], we can show that

$$G = \exp\left[\frac{\beta V}{\ln(b/a)}\,\ln\left(\frac{V\lambda e}{I\,a\,\ln(b/a)}\right)\right] \tag{4.26}$$

This allows us to understand the rapid and approximately exponential
rise of the gas gain with applied voltage that we saw in Fig. 4.12 for the
proportional regime.
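
The following sketch evaluates the gas gain for a cylindrical cell in the linear approximation α = βE, once from the closed form and once by integrating α ds numerically along the radial path (eqn 4.24 with the 1/r field of eqn 4.27). All parameter values (wire and cathode radii, β, ionization energy, mean free path) are illustrative assumptions, not measured gas properties.

```python
import math

# Illustrative assumptions, not measured values:
A_WIRE = 10e-6      # anode wire radius a, in m
B_CATH = 5e-3       # cathode radius b, in m
BETA = 0.025        # alpha = BETA * E, with BETA in 1/V
ION_ENERGY = 26.0   # average ionization energy I, in eV
MFP = 5e-6          # electron mean free path lambda, in m

def gain_closed_form(V):
    """Gain for alpha = beta*E: G = (E_max/E_min)^(beta*V/ln(b/a))."""
    k = math.log(B_CATH / A_WIRE)
    e_max = V / (k * A_WIRE)      # field at the wire surface, V/m
    e_min = ION_ENERGY / MFP      # from e*E_min*lambda = I, V/m
    if e_max <= e_min:
        return 1.0                # field never reaches the avalanche threshold
    return math.exp(BETA * V / k * math.log(e_max / e_min))

def gain_numeric(V, steps=20000):
    """Trapezoid-rule integral of alpha ds from the wire out to where E = E_min."""
    k = math.log(B_CATH / A_WIRE)
    r_start = V / (k * (ION_ENERGY / MFP))   # radius at which E = E_min
    if r_start <= A_WIRE:
        return 1.0
    h = (r_start - A_WIRE) / steps
    alpha = lambda r: BETA * V / (k * r)
    integral = 0.5 * (alpha(A_WIRE) + alpha(r_start))
    integral += sum(alpha(A_WIRE + i * h) for i in range(1, steps))
    return math.exp(integral * h)

for volts in (1200.0, 1500.0, 1800.0):
    print(volts, gain_closed_form(volts), gain_numeric(volts))
```

For these assumed parameters, the gain rises from several hundred at 1200 V to a few times 10^5 at 1800 V, which illustrates the steep voltage dependence in the proportional regime.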


Proportional wire chambers 


Figure 4.13 shows the geometry of a cylindrical proportional chamber. In 
a typical arrangement, there is a thin anode wire at a high-voltage (HV) 
potential of a few kilovolts on the axis and the cylindrical cathode is 
at ground potential. The wire has a radius of 10-20 μm. Assuming that
the length of the wire is much greater than its diameter (a very good 
approximation), we can easily calculate the magnitude of the electric 
field from Gauss’s law. Taking a cylindrical surface around the wire, we 
can show that the magnitude of the electric field is given by 


$$E = \frac{V}{\ln(b/a)\,r} \tag{4.27}$$

(see Fig. 4.14), where V is the potential difference between the anode
and cathode, a and b are the radii of the anode and cathode, respectively,
and r is the radial distance from the centre of the anode wire. The cell is
filled with a gas. A common, cheap, and safe choice for the gas is a 9:1
argon and CO2 mixture: the noble gas has the advantage of chemical
inertness, so the electrons liberated by ionization will be able to travel
without being absorbed (the role of the CO2 is explained below).

A charged particle crossing the cell ionizes the gas, creating about 40-60
electron-ion pairs per centimetre. This number of electron-ion pairs is
then increased by a factor of 2-3 because some electrons have enough en- 
ergy to ionize the gas further. Electrons drift towards the anode and the 
much slower (massive) ions drift towards the cathode in a diffusion-like 
process. Very close to the anode, a few times the anode wire diameter, 
the electric field is high enough (the anode wire is thin) to accelerate 
the drifting electrons to energies allowing further gas ionization, and this 
leads to an avalanche process (see the discussion of gas gain earlier in 
this section).28 There is also recombination of electrons and ions with
emission of UV photons. These photons, if not absorbed, could eject 
electrons from the cathode, leading to a continuous electric discharge. 
The role of the CO2 (or another gas with molecules with many degrees 
of freedom) is to absorb the UV photons and transform their energies to 
molecular vibration or rotation, which then decay via emission of longer- 
wavelength photons. These longer-wavelength photons have too low an 
energy to eject electrons from the cathode. In this typical arrangement, 
the cell operates in the proportional regime. 

Many different types of ‘wire’ chambers have been developed. They 
do not necessarily even have to contain wires, but they all rely on a large 
electric field to create an avalanche and they detect the induced currents 
caused by the drifting electrons and positive ions. These are described 
in the references in Further Reading.

Fig. 4.13 A fundamental cell of a wire chamber. Not to scale.

Fig. 4.14 Electric field inside the fundamental cell of a wire chamber.


28 Electrons and ions are accelerated
by the electric field, but also undergo 
many collisions with gas atoms and 
thereby acquire a uniform ‘drift vel- 
ocity’ superimposed on their random 
motion as in a conductor. For our pur- 
poses, we can ignore the random mo- 
tion and just consider the drift velocity. 
However, the random motion contrib- 
utes to diffusion and is one of the 
factors limiting the resolution of wire 
chambers.

Fig. 4.15 A multi-wire proportional chamber (MWPC).

Fig. 4.16 A drift chamber and its fundamental cells.


29 Typically a few cm per μs.


30 It is often a little more complicated,
because, in the presence of a magnetic 
field, electrons do not drift along lines 
of the electric field in the drift chamber 
but at an angle to them, known as the 
Lorentz angle. 


We will consider two common types of wire chambers: 


e multi-wire proportional chambers (MWPCs); 


e proportional drift chambers (drift chambers). 


A typical example of an MWPC, as sketched in Fig. 4.15, consists of a 
plane of anode wires between two planes of cathodes (sometimes cathode
wires). Such chambers are often used in fixed-target experiments where 
charged particles are crossing chambers close to perpendicular to their 
anode planes. If the spacing between anode wires is d, and a simple 
binary readout is used (i.e. a wire records either a hit or a no-hit), then
the resolution of reconstructed points on the charged-particle trajectory
is d/√12 (see Exercise 4.12). The separation between anode wires cannot
be too small, because of the large electrostatic forces on the wires. The 
wires are held under tension to prevent neighbouring wires touching, but 
this imposes a minimum separation of a few millimetres. 
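
The d/√12 resolution quoted above for binary readout can be checked with a small Monte Carlo sketch. It assumes tracks that arrive uniformly across the chamber and are always assigned to the nearest wire; the pitch, number of wires, and random seed are arbitrary illustrative choices.

```python
import math
import random

def binary_readout_rms(pitch, n_wires=100, n_tracks=200_000, seed=12345):
    """RMS of (true position - position of the hit wire) for binary readout."""
    rng = random.Random(seed)
    sum_sq = 0.0
    for _ in range(n_tracks):
        x_true = rng.uniform(0.0, n_wires * pitch)
        nearest_wire = round(x_true / pitch)    # index of the wire that records the hit
        residual = x_true - nearest_wire * pitch
        sum_sq += residual * residual
    return math.sqrt(sum_sq / n_tracks)

pitch_mm = 2.0   # assumed wire spacing d, in mm
print(binary_readout_rms(pitch_mm), pitch_mm / math.sqrt(12.0))
```

The simulated RMS residual should agree with d/√12 to within the statistical precision of the sample.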

Drift chambers have better spatial resolution, down to about 50 μm.
A drift chamber in the barrel of a collider (head-on collisions) detector 
has a cylindrical structure. The anode wires are parallel to the chamber 
axis (parallel to the beam direction) and each wire is surrounded by 
cathode wires, creating a fundamental cell as sketched in Fig. 4.16. Such 
a cell might be several centimetres across, so the anode (or sense) wires 
are far apart from each other in comparison with an MWPC arrange- 
ment. This arrangement provides position measurements in the plane 
perpendicular to the beam axis. The trick is to measure the electrons’
drift time. A fast independent detector such as a scintillator records
precisely when the colliding beams interacted and produced the charged
particles that cross the drift chamber. One can then measure the time
between this interaction, which fixes the time of the primary ionization,
and the leading edge of the signal from an anode wire.

Measurement of position along the beam direction can be done by dif- 
ferent techniques. One method is to use anode wires at a small angle to 
the beam direction, which allows ‘stereo’ reconstruction of the distance 
along the beam axis. The geometry of the fundamental cell as well as 
the gas composition need to be chosen carefully, so that the drift velocity29
of electrons is as uniform as possible across the cell, allowing for precise
measurement of the location where the primary ionization took place,
calculated from the drift time and the drift velocity.30 In older experiments,
the time of the signal was measured relative to an independent
signal from a fast detector like a scintillator. At the LHC, an external 
timing detector is not necessary, because the LHC machine clock running 
at 40.08 MHz can be used.
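
The conversion from drift time to position described above is a simple product, sketched below under the assumption of a constant drift velocity and no magnetic field. The function names and the numerical inputs are illustrative assumptions.

```python
def drift_distance_um(t_signal_ns, t_reference_ns, drift_velocity_cm_per_us):
    """Distance of the primary ionization from the sense wire, in micrometres.

    Assumes a constant drift velocity across the cell; 1 cm/us equals 10 um/ns.
    """
    drift_time_ns = t_signal_ns - t_reference_ns
    return drift_time_ns * drift_velocity_cm_per_us * 10.0

def timing_needed_ns(position_resolution_um, drift_velocity_cm_per_us):
    """Timing precision required to reach a given position resolution."""
    return position_resolution_um / (drift_velocity_cm_per_us * 10.0)

# Assumed example: 5 cm/us drift velocity, signal arriving 120 ns after the collision.
print(drift_distance_um(120.0, 0.0, 5.0))   # distance in um
print(timing_needed_ns(50.0, 5.0))          # ns needed for a 50 um resolution
```

With these numbers, a 50 μm position resolution corresponds to a timing precision of about 1 ns, which shows why a precise reference time is essential.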


Signals and readout for wire chambers 


In this section, we will calculate the induced current in a cylindrical 
wire chamber. The electrons drifting towards the anode will create an 
avalanche very close to the anode wire. To a first approximation, we can
neglect the induced signal from the flow of electrons because they travel
such a short distance. We can then calculate the induced current as the 
positive ions drift away from the wire to the cathode. 

We can easily calculate the current induced by the motion of a single 
ion in the simple cylindrical wire chamber using eqn 4.21. As this is 
a two-electrode geometry, we can read off the weighting field from the 
actual electric field by setting the voltage across the chamber to be 1 V 
and therefore, from eqn 4.27, 

$$|E_w| = \frac{1}{\ln(b/a)\,r} \tag{4.28}$$

The drift velocity of the ion, v_d, is related to the electric field E by
v_d = μE, where μ is the ion mobility. We will assume that the mobility
is constant. If the number of electron-ion pairs created by the avalanche
from a single primary electron is N_tot, then the induced current
(eqn 4.21) is31

$$I = -N_{\rm tot}\,e\,\frac{v_d}{\ln(b/a)\,r} \tag{4.29}$$

Substituting for the electric field for this geometry (eqn 4.27, with V_0
the voltage across the chamber), we get the ion speed as

$$v_d = \frac{dr}{dt} = \frac{\mu V_0}{\ln(b/a)\,r} \tag{4.30}$$


Multiplying both sides by r, we can integrate eqn 4.30 and solve for r:

$$r = a\left(1 + \frac{t}{t_0}\right)^{1/2} \tag{4.31}$$

where t_0 = a^2 ln(b/a)/(2μV_0). Substituting from eqn 4.30 into eqn 4.29,
we get

$$I(t) = -N_{\rm tot}\,e\,\frac{1}{\ln(b/a)\,r}\,\frac{\mu V_0}{\ln(b/a)\,r} \tag{4.32}$$


Substituting for r from eqn 4.31 into eqn 4.32, we can calculate the
induced current as a function of time:

$$I(t) = -\frac{N_{\rm tot}\,e}{2\ln(b/a)}\,\frac{1}{t + t_0} \tag{4.33}$$

This current flows up to the time (t_max) when the positive ions reach
the cathode. We calculate t_max from eqn 4.31 by setting r(t_max) = b:

$$t_{\max} = (b^2 - a^2)\,\frac{\ln(b/a)}{2\mu V_0} \tag{4.34}$$

Calculating t_0 and t_max for typical conditions (see Exercise 4.11), we find
t_0 ~ 10 ns and t_max ~ 100 μs. This pulse shape is illustrated in Fig. 4.17.
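
The time scales of the pulse can be evaluated with a short sketch. The geometry, voltage, and ion mobility below are assumptions chosen to give values of the order quoted in the text; they are not the inputs of Exercise 4.11. The second function integrates the 1/(t + t_0) pulse shape to give the fraction of the total charge N_tot e induced up to time t.

```python
import math

def pulse_parameters(a, b, V0, mobility):
    """Return (t0, t_max) in seconds for a cylindrical cell (SI units)."""
    k = math.log(b / a)
    t0 = a**2 * k / (2.0 * mobility * V0)
    t_max = (b**2 - a**2) * k / (2.0 * mobility * V0)
    return t0, t_max

def induced_charge_fraction(t, a, b, V0, mobility):
    """Integral of the ion current from 0 to t, divided by N_tot * e."""
    t0, t_max = pulse_parameters(a, b, V0, mobility)
    t = min(t, t_max)   # no current flows after the ions reach the cathode
    return math.log(1.0 + t / t0) / (2.0 * math.log(b / a))

# Assumed values: a = 25 um, b = 2.5 mm, V0 = 1 kV, mobility = 1.5e-4 m^2/(V s).
params = (25e-6, 2.5e-3, 1000.0, 1.5e-4)
t0, t_max = pulse_parameters(*params)
print(t0, t_max)
print(induced_charge_fraction(25e-9, *params))   # fraction induced in the first 25 ns
print(induced_charge_fraction(t_max, *params))   # all of the charge by t_max
```

For these assumed values, only about 14% of the charge is induced in the first 25 ns, while the integral up to t_max recovers the full charge, a useful consistency check on the pulse shape.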


31 The signal from a single electron is
too small to measure. However, with
the large gas amplification, the signal
can be measured by a suitable low-noise
amplifier.


32 These currents could otherwise flow
from the high-voltage power supply to
the ground, owing to occasional electric
discharges (sparks), and might melt the
wires (which are thin).

Fig. 4.17 Typical pulse shape from a cylindrical wire chamber.


33 The leakage current represents a
noise source that if too large will swamp 
the signal. In addition, the leakage cur- 
rent will lead to local heating of the 
silicon, and it is difficult to remove this 
heat without adding excess material. 


34 Thermal generation of electron-hole
pairs will always occur, but the 
resulting leakage current is usually 
acceptable—if it is not, it can be re- 
duced by cooling the silicon. 


Fig. 4.18 Fundamental cell readout circuit.


Equation 4.33 shows that the current pulse has a fast peak and then a 
very slow ‘tail’. For a high-rate application such as a collider detector, we 
need fast pulses. We can produce a fast pulse by suitable ‘pulse shaping’; 
this is done by filtering in frequency space to remove the low-frequency 
signals. A typical electronic readout circuit is sketched in Fig. 4.18. R is
very large (~MΩ), thus protecting the anode wires from large currents.32
R2C2 and RC1 are time constants, small enough to allow a fast current
flow through R2, the input resistance of a preamplifier connected to the 
anode wire, isolated from the high voltage by the C2 capacitor. 


4.6.2 Silicon detectors 


Silicon strip detectors as well as silicon pixel detectors are playing an 
increasingly important role in tracking. The operation of silicon detect- 
ors is based on the fact that silicon is a semiconductor with a bandgap 
of 1.1 eV. A high-energy charged particle traversing silicon will interact
with the silicon to create electron-hole pairs. However, most of the
energy goes into phonons, so the average energy lost per electron-hole
pair created is significantly larger, about 3.6 eV. This results in about 80
electron-hole pairs per micrometre for a minimum-ionizing particle. If 
no external field were applied, the electron-hole pairs would move apart 
slowly owing to diffusion; however, this process is too slow for most 
applications in particle detectors. Therefore, an electric field is applied 
to separate the electrons and holes. This motion of electrons and holes 
causes an induced current to flow in the external circuit as discussed in 
Section 4.4.

Even in high-purity, high-resistivity silicon, the presence of a strong 
electric field would result in an unacceptably large leakage current, i.e. 
current flowing even without the presence of the charged particle.33 This
problem is solved by making a pn junction, which forms a diode junc- 
tion. When a reverse bias is applied to the diode, the free electrons are 
removed from the n-doped region, creating a ‘depletion’ region, in which 
there is a very low density of free carriers, thus allowing a large electric 
field to be applied, without paying the price of the unwanted large leakage
current.34 How thick does the silicon have to be to create a big
enough signal? There is no single correct answer to this question, because
it depends on the amplifier, but a typical choice is 300 μm, which
results in a signal of about 25000 electron-hole pairs for a minimum-
ionizing particle. The next question to consider is how large a bias
voltage is needed to fully deplete the silicon. We can answer this question
starting from Poisson’s equation for the potential V in terms of the
charge density ρ and the permittivity ε:


$$\nabla^2 V = -\frac{\rho}{\varepsilon} \tag{4.35}$$


If we assume an effectively one-dimensional diode, and N is the
net volume number density of charges, we can use eqn 4.35 to calculate
the potential as a function of the distance x:

$$\frac{d^2V}{dx^2} = -\frac{Ne}{\varepsilon} \tag{4.36}$$

where e is the electron charge and ε is the permittivity of silicon. We
will consider a detector with p strips in n bulk silicon. In this case, the p
region is much more heavily doped than the n region, so we only need to
consider the n-doped region. On applying the reverse bias, we remove all
the free electrons from the n-doped region, which leaves behind a fixed
space-charge density. Integrating eqn 4.36 gives35

$$\frac{dV}{dx} = -\frac{eN_d}{\varepsilon}\,(x - x_n) \tag{4.37}$$

where N_d is the donor (electron) density and x_n is the limit of the
depletion region. Integrating eqn 4.37, and taking V = 0 at x = 0, gives

$$V(x) = -\frac{eN_d}{\varepsilon}\left(\frac{x^2}{2} - x\,x_n\right) \tag{4.38}$$

Finally, the total voltage applied across the depletion region is found by
setting x = x_n:

$$V_{\rm bias} = \frac{eN_d\,x_n^2}{2\varepsilon} \tag{4.39}$$


Equation 4.39 shows why we need high-purity silicon to make good
detectors: impurities contribute to N_d and hence cause an increase
in the bias voltage required for full depletion.36 With typical
detector-grade silicon, a 300 μm-thick silicon detector requires a bias
voltage of about 50 V (see Exercise 4.13). We can calculate the drift
velocities for electrons and holes:


$$v_{\rm drift} = \mu E \tag{4.40}$$


where μ is the mobility and E is the electric field. We can use the
measured mobilities to calculate the maximum drift times for electrons 
and holes (see Exercise 4.13). Detectors are typically operated at higher 
bias voltages to speed up the signal collection. Great care is needed in 
the design of silicon detectors, because too large electric fields can lead 


35 dV/dx is just equal to minus the
electric field. 


36 Too high a bias voltage will result 
in electrical breakdown in the cables or 
the silicon detector itself.


37 In normal operation, there is no
equivalent of gas gain in silicon detectors, although devices like avalanche
tectors, although devices like avalanche 
photodiodes can be operated at suffi- 
cient voltage for amplification to occur. 


38 Occurring every 25 ns under nominal
LHC operating conditions. 


39 Electrical power consumption must 
be minimized: if more heat is generated, 
the cooling system must be larger, de- 
grading the tracker resolution and cre- 
ating unwanted secondary interactions 
in the tracker. 


40 Gy(Si) is the SI unit of dose, corresponding
to 1 J of energy deposited
per kg of Si: a dose of 100 kGy(Si)
corresponds to ~10^9 lung X-rays.


41 A useful rule of thumb is that the
leakage current doubles for every 7K 
increase in temperature. 


42 With sufficient damage, this can
cause n-type silicon to change to p-type 
silicon, a process called type inversion. 
However, detectors can operate satis- 
factorily after type inversion. 


to electrons gaining enough energy to cause secondary ionization, which 
leads to an avalanche effect and hence results in electrical breakdown. 
This will start in the region of highest electric field; any very small-scale 
non-uniformities in the electrode structure can cause enhanced electric 
fields and hence lead to electrical breakdown, even at relatively low bias 
voltages. 
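
The depletion voltage of eqn 4.39 and the order of magnitude of the drift times can be estimated with the sketch below. The donor density and mobilities are typical textbook values used here as assumptions, and the drift-time estimate treats the field as uniform (E = V/d) and ignores velocity saturation, so it is only a rough guide.

```python
E_CHARGE = 1.602e-19          # electron charge in C
EPS_SI = 11.9 * 8.854e-12     # permittivity of silicon in F/m

def depletion_voltage(donor_density_m3, thickness_m):
    """Eqn 4.39 with the depletion depth x_n set to the full detector thickness."""
    return E_CHARGE * donor_density_m3 * thickness_m**2 / (2.0 * EPS_SI)

def drift_time_uniform_field(thickness_m, bias_V, mobility_m2_per_Vs):
    """Crude drift time across the detector: d / (mu * V / d), using eqn 4.40."""
    return thickness_m**2 / (mobility_m2_per_Vs * bias_V)

# Assumed values: N_d = 7e11 cm^-3 = 7e17 m^-3, thickness 300 um.
print(depletion_voltage(7e17, 300e-6))                 # volts
# Assumed mobilities: 0.135 (electrons) and 0.045 (holes) m^2/(V s), at 100 V bias.
print(drift_time_uniform_field(300e-6, 100.0, 0.135))  # seconds, electrons
print(drift_time_uniform_field(300e-6, 100.0, 0.045))  # seconds, holes
```

With these assumptions, the depletion voltage comes out close to 50 V, and the drift times are of order 10 ns, with holes about three times slower than electrons.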

To determine whether this small signal37 can be detected, it is essential
to consider all sources of electronic noise. This is discussed in detail in 
Spieler’s book in Further Reading, from which we see that we need to 
have low-capacitance detectors. For high-rate applications, such as the 
silicon trackers at the LHC, we need to minimize ‘pile-up’ backgrounds 
from hits in previous bunch crossings38 generating spurious hits in the
triggered bunch crossing. This implies that the ‘shaping time’ of the 
electronics should be not more than O(25 ns). The challenge is to design 
low-noise amplifiers that are sufficiently fast and consume low power.39


Radiation damage 


One of the difficulties with the application of silicon detectors in par- 
ticle physics, particularly at the LHC, is radiation damage. At a radius 
of 30 cm from the beam line, the expected ionizing dose over the detector 
lifetime is 100 kGy(Si).40 High-energy particles can displace silicon atoms
from their lattice sites, creating complex defects that result in states be- 
tween the valence and conduction bands (called mid-bandgap states). 
This makes it much easier for thermal generation to promote an elec- 
tron from the valence to the conduction band. This greatly increases 
the leakage current. The leakage current is strongly dependent on the
temperature T:


$$I_{\rm leak}(T) = A\,T^2 \exp\left(-\frac{E_g}{2k_BT}\right) \tag{4.41}$$


where k_B is Boltzmann’s constant and E_g is the bandgap, which for
silicon is 1.1 eV.41 Therefore, the leakage current can be very efficiently
suppressed by cooling the silicon. These mid-bandgap states act like
extra acceptors and thus change the effective dopant concentration.42
From eqn 4.39, we can see that an increase in the effective dopant con- 
centration will result in detectors requiring higher bias voltages to be 
fully depleted. The electrical breakdown of detectors at very high volt- 
ages therefore sets the scale for the maximum radiation doses that can 
be tolerated. In addition, some of the extra states can cause ‘charge 
trapping’, which looks like a signal loss. 
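As a rough numerical illustration of the temperature dependence, the sketch below evaluates the ratio of leakage currents at two temperatures, reading eqn 4.41 as $I_{\mathrm{leak}} \propto T^2 \exp(-E_g/2k_BT)$ so that the constant A cancels. The function name and the two example temperatures are assumptions made for illustration only; the bandgap value is the 1.1 eV quoted in the text.

```python
import math

K_B_EV_PER_K = 8.617333e-5   # Boltzmann constant in eV/K
E_GAP_SI_EV = 1.1            # silicon bandgap quoted in the text


def leakage_ratio(t_cold_k, t_warm_k, e_gap_ev=E_GAP_SI_EV):
    """Return I_leak(t_cold) / I_leak(t_warm) using I ~ T^2 exp(-E_g / (2 k_B T)).

    The prefactor A of eqn 4.41 cancels in the ratio.
    """
    prefactor = (t_cold_k / t_warm_k) ** 2
    exponent = -e_gap_ev / (2.0 * K_B_EV_PER_K) * (1.0 / t_cold_k - 1.0 / t_warm_k)
    return prefactor * math.exp(exponent)


if __name__ == "__main__":
    # Illustrative choice: cooling from 20 degrees C to -10 degrees C.
    ratio = leakage_ratio(263.15, 293.15)
    print(f"I_leak(-10 C) / I_leak(20 C) = {ratio:.3f}")
```

With these illustrative temperatures the leakage current falls by roughly an order of magnitude, which is why modest cooling is so effective.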


Silicon systems 


The spatial resolution of a silicon detector is largely determined by the 
segmentation of the silicon into individual detector channels. If the width 
of a detector segment is x, and if a particle only causes a hit in a single
channel, then the spatial resolution in this direction is $x/\sqrt{12}$. Improved
resolution can be achieved by using signals in neighbouring channels; 
the amount of charge sharing with neighbouring channels gives extra 
information on the location of the ‘hit’. 
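The single-channel resolution quoted above is the RMS of a uniform distribution of width x. The following minimal sketch compares the analytic value with a simple Monte Carlo estimate; the 80 µm pitch and the function names are illustrative assumptions, not values taken from a particular detector.

```python
import math
import random


def single_hit_resolution(pitch):
    """RMS error when a hit is assigned to the centre of a segment of width `pitch`."""
    return pitch / math.sqrt(12.0)


def simulated_resolution(pitch, n_hits=200_000, seed=1):
    """Monte Carlo estimate: true positions are uniform across one segment,
    and the reconstructed position is always the segment centre (taken as 0)."""
    rng = random.Random(seed)
    sum_sq = 0.0
    for _ in range(n_hits):
        residual = rng.uniform(-pitch / 2.0, pitch / 2.0)
        sum_sq += residual * residual
    return math.sqrt(sum_sq / n_hits)


if __name__ == "__main__":
    pitch_um = 80.0  # illustrative strip pitch in micrometres
    print(single_hit_resolution(pitch_um), simulated_resolution(pitch_um))
```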

There are generally two classes of silicon detector systems: strips and 
pixels. 


Strip systems 


A very simplified schematic cross-section of part of a generic silicon 
strip detector is shown in Fig. 4.19. A positive high voltage is applied 
to the ‘backside’ via the Al contact, which depletes the n-bulk silicon. 
Electron–hole pairs created by ionizing particles drift in the electric field
and the current induces signals on the readout electrodes. The signal
electrodes are AC-coupled to the Al strips (using the SiO₂ as an insulator),
which are then connected to the preamplifiers in the readout ASIC
(application-specific integrated circuit). The noise increases with the de- 
tector capacitance (see Spieler in Further Reading for an explanation); 
therefore, for high-rate applications such as the LHC, we must minimize 
any stray capacitance between the detector and the amplifier. The
connection is made with ‘wire bonds’, which are typically a few millimetres
long and made of 25 µm-thick aluminium wire. These thin bond wires can be ultrasonically
bonded to pads on the detector and on the readout ASIC. This allows 
a very short connection between the detector and the amplifier, which 
introduces much less capacitance than a longer wire cable. We also need 
to create a DC return path for the current and this requires a large- 
value resistor, so that the fast signal flows across the capacitor. This is 
achieved with polysilicon resistors inside the silicon detector itself. 

In a strip detector, the silicon wafer is divided into long narrow strips,
with typical strip widths in the range of 50–100 µm (the largest wafers
used are 6 inches in diameter). This is done to obtain good precision in
the bending plane of the magnetic field. Modest resolution (~1 mm) in
the orthogonal direction is achieved by having half the sensors with a
small stereo angle. This has the disadvantage that it creates ambiguities
if more than one particle hits a sensor.⁴³

The amplifiers are in custom-designed ASICs. As the time taken for 
the first-level trigger (L1) (see Section 4.10) is of the order of micro- 
seconds, which is much longer than the 25 ns between bunch crossings, 
the data must be kept on-detector until the trigger decision is made. 
This is achieved with ‘pipeline’ memory in which the data from each 
strip for each bunch crossing are stored in different memory elements 
(see Fig. 4.20). If the L1 rejects the event, the corresponding data can 
be overwritten. If the event is triggered at L1, the corresponding data 
are read out via optical links. 
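The pipeline principle described above can be sketched as a ring buffer with a write pointer and a read pointer that trails it by the trigger latency. This is only a toy model under stated assumptions: the class name, the 12-cell depth, and the 8-crossing latency are hypothetical, and real front-end ASICs implement this in hardware for every strip.

```python
class PipelineMemory:
    """Toy ring-buffer pipeline for one readout channel.

    Data for each bunch crossing are written into successive cells.
    The trigger decision for a crossing arrives `latency` clock cycles
    later, when the read pointer reaches the cell holding that crossing.
    """

    def __init__(self, n_cells=12, latency=8):
        if not 0 < latency < n_cells:
            raise ValueError("latency must be between 1 and n_cells - 1")
        self.cells = [None] * n_cells
        self.latency = latency
        self.clock = 0

    def tick(self, data, l1_accept=False):
        """Advance one clock cycle.

        Stores `data` for the current crossing and, if `l1_accept` is set,
        returns the data recorded `latency` crossings earlier; otherwise
        returns None and that cell is simply left to be overwritten later.
        """
        n = len(self.cells)
        read_index = (self.clock - self.latency) % n
        out = None
        if l1_accept and self.clock >= self.latency:
            out = self.cells[read_index]
        self.cells[self.clock % n] = data   # write pointer
        self.clock += 1
        return out


if __name__ == "__main__":
    pipe = PipelineMemory(n_cells=12, latency=8)
    for bx in range(30):
        accepted = pipe.tick(f"bx{bx}", l1_accept=(bx == 20))
        if accepted is not None:
            print("read out", accepted)
```

The buffer must be deeper than the trigger latency, otherwise data would be overwritten before the decision arrives; the constructor enforces this.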

A schematic view of an ATLAS Semi-Conductor Tracker (SCT) module
is shown in Fig. 4.21. The module consists of two pairs of silicon
wafers glued together to make a double-sided module. The ASICs are
mounted on flexible copper–Kapton circuits. The beryllia (BeO) ‘ear’
at the side allows the module to make good thermal contact with the
cooling tube. The coolant used is perfluoropropane (C₃F₈), since this
provides very efficient two-phase cooling; i.e. the heat from the ASICs
and the silicon detectors is used to evaporate liquid C₃F₈. These are very
large systems, with 60 m² of silicon detectors for ATLAS and 200 m² for
CMS. The modules have to be held rigidly in place to benefit from the
high intrinsic spatial resolution, but the material must be minimized
because any material causes multiple scattering of all charged particles
and results in electrons and photons starting electromagnetic showers
before the calorimeter. Therefore, each module is mounted on
carbon-fibre support structures since these provide the best ratio of stiffness to
weight.

Fig. 4.19 Schematic cross-section through a silicon micro-strip detector
with p implants in an n-bulk silicon.

Fig. 4.20 Principle of a pipelined memory. At each clock cycle, data are
written into the cell defined by the write pointer. This pointer is advanced
by one cell every clock cycle; after it gets to the last cell, it cycles back to
the first. The read pointer follows a fixed number of clock cycles behind
the write pointer. The time delay between the write and read pointers
defines the time available for making a trigger decision. If the decision is
positive, the data are read out from the corresponding cell; if not, then
new data can be read into this cell. When the pointers advance beyond
the last cell (12 in this unrealistic example), they cycle back to cell 1.

43 In the ATLAS case, a discriminator is used to determine if hits are above
threshold, so the output data are digital. In the CMS tracker, the signal
amplitude is transmitted off-detector via analog optical links.

Fig. 4.21 Schematic view of an SCT module [6].


Pixel systems 


In silicon pixel detectors, the silicon is divided into much smaller areas;
for example, in the ATLAS pixel detector, the dimensions of individual
pixels are 50 µm × 400 µm. The smaller dimension is in the bending
plane of the magnetic field to optimize the momentum resolution. The
first advantage of pixel over strip detectors is that they provide
unambiguous high-precision space points. In addition, the ‘occupancies’ (i.e.
the fractions of detector elements that are hit in given events) are much
lower for pixel detectors than for strips. This is vital for pattern
recognition at the LHC, which has to reconstruct tracks in the presence of
‘pile-up’ background from about 25 collisions in the same bunch crossing.
The small area of the pixels means that the detector capacitance
is very low, which allows very low noise to be achieved (see Spieler in
Further Reading). However, this requires minimization of stray
capacitance between the silicon pixel and the amplifier in the ASIC. One of
the main difficulties with pixel systems is how to make the electrical
connection from each silicon pixel to a unique channel of the readout
ASIC without introducing any significant capacitance. This is achieved
by ‘bump bonding’.⁴⁴ The much larger number of channels in pixel
systems than in strips requires more sophisticated data processing in the
ASICs.⁴⁵ Other system aspects for pixels are similar to those for strips.

Pixel systems offer many performance advantages over strips, but as 
the electronics covers essentially the full sensitive area, a layer of pixel 
detector will have more material than an equivalent layer of strips. In 
addition, pixel detectors are significantly more expensive than strip de- 
tectors of the same dimensions. Therefore, LHC detector systems are a 
compromise, with pixels being used close to the beam pipe and strips 
being used further away. 


4.6.3 Tracker performance 


Consider the track of a charged particle with momentum p (measured
in GeV) perpendicular to a magnetic field B. The radius of curvature
R is related to the momentum by p = 0.3BqR (with B in tesla, R in
metres, and q in units of the elementary charge). We assume that the
track is measured over a length l (see Fig. 4.22). From the geometry,
we can relate R to the ‘sagitta’ s and l by Pythagoras’ theorem:
$R^2 = (R - s)^2 + (l/2)^2$. For high-momentum tracks, we can neglect the $s^2$
term and find $1/R = 8s/l^2$. Therefore, the error in 1/R is given by
$\sigma(1/R) = 8\,\delta s/l^2$. To make approximate estimates of the momentum
resolution, we will assume that the track is measured very precisely at
the start and end of the trajectory but with an error given by $\delta s$ at the
midpoint. In this approximation, and for a singly charged particle (q = 1),

$$ \sigma(1/p) = \frac{8\,\delta s}{0.3\,B\,l^2} \qquad (4.42) $$

Although eqn 4.42 is a rough approximation, some general features
are valid:


(1) B field: The resolution improves with the value of B, so we wish 
to use the largest value possible. Using superconducting magnets, 
fields up to 4T have been achieved. 


(2) Length: The resolution improves as $l^2$; however, for a tracker in
a collider detector, the value of l is set by the inner radius of
the electromagnetic calorimeter. Increasing l too far is therefore
impractical for cost reasons; for the ATLAS inner detector, the
track length is about 1 m. For muon spectrometers, the constraints
are weaker and large values of l can be used (e.g. in the ATLAS
muon spectrometer, l ≈ 5 m).


(3) Scaling: The resolution is constant in 1/p, which implies that the
fractional momentum resolution, $\sigma(p)/p = p\,\sigma(1/p)$, degrades
linearly with increasing momentum.
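The scaling behaviour in the list above can be made concrete with a short numerical sketch of eqn 4.42 for a singly charged particle. The function names and the example inputs (a 2 T field, a 1 m track length, and a 20 µm sagitta error) are illustrative assumptions, not parameters of a specific detector.

```python
def sigma_inv_p(delta_s_m, b_tesla, length_m):
    """sigma(1/p) in GeV^-1 from eqn 4.42, assuming unit charge (q = 1)."""
    return 8.0 * delta_s_m / (0.3 * b_tesla * length_m ** 2)


def fractional_resolution(p_gev, delta_s_m, b_tesla, length_m):
    """sigma(p)/p = p * sigma(1/p); grows linearly with momentum."""
    return p_gev * sigma_inv_p(delta_s_m, b_tesla, length_m)


def required_sagitta_error(p_max_gev, target_fraction, b_tesla, length_m):
    """Sagitta error (in metres) needed to reach sigma(p)/p = target_fraction at p_max."""
    return target_fraction * 0.3 * b_tesla * length_m ** 2 / (8.0 * p_max_gev)


if __name__ == "__main__":
    # Illustrative inputs only: B = 2 T, l = 1 m, delta_s = 20 micrometres.
    for p in (10.0, 100.0, 1000.0):
        frac = fractional_resolution(p, 20e-6, 2.0, 1.0)
        print(f"p = {p:6.0f} GeV  sigma(p)/p = {100 * frac:.2f} %")
```

Doubling l reduces the required precision by a factor of four, whereas doubling B only gains a factor of two, which reflects the first two items in the list.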


44 In this process, indium solder is deposited on metallized pads on the pixel
and heated in a reflow process to form hemispherical solder balls; the detector
is then flipped and positioned very precisely over a flexible circuit with the
readout ASICs already mounted. A further reflow of the solder results in an
electrical connection between the pixels and the amplifiers in the readout
ASICs. It is difficult to achieve a high yield and this process is very expensive.

45 The area required in an ASIC to implement a pipeline for each pixel would
be unacceptably large. Therefore, another approach is used for the pipeline
that benefits from the very low occupancies in the pixels. A data-driven
pipeline is used so that when a pixel is above threshold for a given bunch
crossing, a ‘time stamp’ for that pixel address is written in memory. When a
first-level trigger is received, the data for all the pixels with the correct time
stamp are read out.

Fig. 4.22 Definition of the track sagitta s.

46 The optimization of the overall resolution is an interesting trade-off,
because adding more measurements will decrease the measurement term C but
will add more material and therefore increase the multiple-scattering term D.

Fig. 4.23 Schematic view of tracks in the transverse plane showing tracks
from the primary vertex and the definition of the impact parameter $d_0$
from the one track resulting from a decay.

47 We cannot afford to instrument the entire calorimeter with the fine
segmentation required to measure the electromagnetic showers.

If B and l are fixed and we wish to measure momenta up to some value
$p_{\max}$, we can use eqn 4.42 to estimate the required spatial resolution
(see Exercise 4.10). So far, we have only considered the contribution of
the precision of the measurement points. However, in a real detector,
we have material, so the charged particles undergo multiple scattering
(eqn 4.6). As the scattering angle is inversely proportional to the
momentum, this gives a contribution to the fractional error in p that is
constant, i.e. a contribution to $\sigma(1/p)$ that is proportional to 1/p.
Therefore, the momentum resolution of a real tracker can be parameterized
by adding the effects of measurement precision and multiple
scattering in quadrature:

$$ \sigma(1/p) = C \oplus \frac{D}{p} \qquad (4.43) $$

where D is the term due to multiple scattering and C is the term due to
measurement resolution.⁴⁶

Another important measure of the performance of a tracking detector 
at a collider is how precisely the tracks can be extrapolated back to the 
primary vertex. Hadrons containing b or c quarks, and τ leptons,
typically survive for of order 1 ps before decaying, so if one extrapolates
the tracks of their decay products back, these will generally miss the primary vertex. In
the plane transverse to the beam direction, this distance is called the 
impact parameter (see Fig. 4.23). The resolution in impact parameter 
depends on the intrinsic resolution of the tracker and multiple scattering. 
Therefore, one requires a very high-precision measurement as close to 
the beam line as possible, and this is performed with silicon detectors 
(either strip or pixels). To minimize multiple scattering, one needs to 
have a very thin (in radiation lengths) beam pipe, and the best choice 
is beryllium (although beryllium is very difficult to machine and hence 
expensive). 


4.7 Detectors for particle jets 


The energies of particles and ‘jets of particles’ are measured in detector 
systems called ‘calorimeters’. Ideally, all particles with the exception of 
muons and neutrinos (or still to be discovered neutrino-like weakly inter- 
acting particles) should deposit all their energies in the calorimeter. As 
electromagnetic showers occupy much smaller volumes than hadronic 
showers (see Section 4.3), we require much finer segmentation for the 
front of the calorimeter than the back.⁴⁷ Therefore, the design of
calorimeter systems is usually split into ‘electromagnetic calorimeters’ and
‘hadronic calorimeters’. 


4.7.1 Electromagnetic calorimeter 


The depth of the electromagnetic calorimeter is chosen such that nearly
all the energy of electromagnetic showers from electrons and photons of
the interesting energy range is contained in this part of the calorimeter.
This can be determined from Monte Carlo simulations such as those
illustrated in Fig. 4.6. At LHC energies, we need to measure electrons
and photons with energies of several hundred GeV; therefore, the
electromagnetic calorimeter needs to be about 25 $X_0$ deep. Finer longitudinal
sampling will also help separate showers induced by electrons from those
induced by hadrons. The lateral shower size is set by the Molière radius
(see Section 4.3), which for lead is $R_M = 1.8$ cm. The scale for the
lateral size of hadronic showers is set by the hadronic interaction length $\lambda_I$
and is typically an order of magnitude larger. We can therefore achieve
further separation between showers induced by electrons and hadrons
with fine lateral and longitudinal segmentation. There are two different
types of electromagnetic calorimeters:


• ‘sandwich’ calorimeters with alternating layers of ‘active’ and
‘passive’ material;


• homogeneous calorimeters in which one material fulfils the function
of absorber as well as actively detecting the presence of the shower.


There are many trade-offs between these approaches. In a sandwich cal- 
orimeter, most of the energy is deposited in the passive layers and there 
are significant fluctuations in the fraction of the energy deposited in the 
active layers. This usually limits the resolution of sandwich calorimeters 
and the best resolution can be achieved with homogeneous calorimeters, 
for which this effect does not arise. However, the average density of crys- 
tals used in homogeneous calorimeters tends to be lower than that in 
sandwich calorimeters, which therefore increases the depth of the elec- 
tromagnetic calorimeter. This results in larger volumes for the hadronic 
calorimeter and the muon system, and will thus increase the cost. 


4.7.2 Homogeneous calorimeters 


Homogeneous calorimeters are usually based on scintillating crystals
(Section 4.4.2). The CMS electromagnetic calorimeter (ECAL) is an
example of this technique. At the LHC, the scintillation must be fast
because of the short time between bunch crossings (25 ns). The crystals
must have very good radiation tolerance in order to survive many years
of LHC operation. Finally, the crystals must have a very high density in
order to keep the dimensions small enough. The CMS electromagnetic
calorimeter is based on lead tungstate (PbWO₄) crystals with a density
of 8.28 g cm⁻³ and a radiation length of 0.89 cm. About 80% of the
scintillation light is emitted in less than 25 ns [62]. One challenge with
this system is that the transparency of the crystals decreases with
radiation, and therefore sophisticated monitoring techniques are required
to compensate for these effects. In addition, the light output is very
sensitive to temperature, so the temperature needs to be maintained
at a constant value. Because photomultipliers cannot be used in very
strong magnetic fields, the scintillation light is read out by avalanche
photodiodes (APDs).⁴⁸ A photograph of one such crystal with the APD
readout is shown in Fig. 4.24.

48 In the end-cap calorimeter, the radiation levels are too large for the use
of APDs, and vacuum phototriodes are used instead.

Fig. 4.24 Photograph of a PbWO₄ crystal and readout for the CMS
electromagnetic calorimeter [62].

Fig. 4.25 Schematic view of one cell of a sandwich scintillator calorimeter
with wavelength-shifting plates to guide the light to the photomultiplier
at the back.


4.7.3 Sandwich calorimeters 


In a sandwich calorimeter, there are alternating layers of active and 
passive material. The passive material should have high Z (to enable 
a relatively compact design)—lead is a common choice. The total en- 
ergy detected in the active layers is only a fraction of the total energy 
deposited. This fraction can be measured in prototypes or small parts 
of the calorimeter in dedicated test beams in which the energy of the 
incident electrons is fixed, although at the LHC the rate of Z produc- 
tion is so high that ‘in situ’ calibration can be performed. Before the 
LHC, the most common design of electromagnetic sandwich calorimeter 
used plastic scintillators for the active layers. The scintillation light (see 
Section 4.4.2) needs to be guided to the photomultipliers at the back 
of the calorimeter. This is done using ‘wavelength-shifting’ plates (see 
Fig. 4.25). These contain fluors to shift the wavelength to longer wave- 
lengths (typically in the green), for which the plastic is more transparent. 
There are several limitations to this technique: 


• Cracks: Each tower requires a support structure to hold it in place,
which introduces dead zones between cells (called ‘cracks’).


• Non-uniformity: The absorption length for the scintillation light
is typically the same magnitude as the lateral dimensions of the
cell; therefore, the response will depend on the impact point of the
electron.


• Radiation damage: The scintillator will suffer significant
radiation damage, and very good calibration schemes are essential to
track this. The most common method used is to move a radioactive
source such as ⁶⁰Co over the calorimeter.


A newer approach to scintillator sandwich calorimeters uses wavelength- 
shifting fibres embedded in the scintillator to transport the light to the 
photodetectors. This avoids the need for bulky waveguides, which add 
to the ‘cracks’ between calorimeter cells. 


To overcome these limitations, a novel type of electromagnetic calorimeter
has been developed for ATLAS, based on a new geometry for lead
absorbers and liquid-argon ionization chambers. The signals are generated
by electrons created by ionization, drifting in a large electric field
and generating an induced current at the electrodes (see Section 4.4).⁴⁹
The fundamental problem with this technique for use at the LHC is that
typical drift times for the electrons are ~400 ns, which is much longer
than the time between bunch crossings of 25 ns. The solution is based on
very fast ‘bipolar’ pulse-shaping electronics, in which most of the signal
is not detected but a suitably fast pulse is generated. This is illustrated
in Fig. 4.26. As most of the signal is not utilized, it is essential to lower
the noise in order to maintain the signal-to-noise ratio. This is achieved
by lowering the capacitance and inductance of the electrodes using a
novel ‘accordion’ geometry as shown in Fig. 4.27. An important advantage
of this technique is that liquid argon is inherently radiation-hard.
It is relatively easy to divide the readout cells to the desired lateral and
longitudinal granularity. Another critical advantage is that the structure
is self-supporting, so there is no need for passive material between
cells, thus avoiding the cracks inherent in calorimeters based on plastic
scintillators for the active layers.

49 This is very similar to a wire chamber operating in the ‘ionization chamber’
region, in which there is no gas gain. However, as liquids are much denser
than gases, a high-energy charged particle can create sufficient ionization for
the signal to be detectable.

Fig. 4.26 Signal pulse shape in the ATLAS liquid-argon calorimeter [20].
The triangular shape is the current pulse created by the electron drift.
The curve shows the pulse shape after shaping with a bipolar pulse shaper.

Fig. 4.27 Sketch of a small section of a prototype for the ATLAS
electromagnetic calorimeter, illustrating the ‘accordion’ structure [38] (all
dimensions are in millimetres). In this geometry, the signals are transported to
the electronics on flat copper/Kapton tapes, which have lower capacitance
and inductance per unit length than the cables that would be required if the
electrodes were orthogonal to the direction of incidence of particles. The
absorber plates are made from lead lined with stainless steel. The liquid argon is
contained between the absorber plates, and the copper/Kapton electrodes are
attached to these plates.

50 There are many interesting trade-offs here. If the scintillator/passive ratio is
increased, sampling fluctuations are reduced, but the size and cost of the
calorimeter are increased. As discussed in Sections 4.7.1 and 4.7.2, there is no
perfect design.

51 At the LHC, the very large sample of $Z \to e^+e^-$ events provides ample data
for in situ calibration of the electromagnetic calorimeters.

                     ATLAS    CMS
a (% GeV^{1/2})      10.0     2.8
b (GeV)              0.4      0.12
c (%)                –        0.3

Table 4.2 Electromagnetic calorimeter resolution for prototype calorimeters
measured in test beams.


4.7.4 Resolution 


The energy resolution of a typical electromagnetic calorimeter can be
parameterized as

$$ \frac{\Delta E}{E} = \frac{a}{\sqrt{E}} \oplus \frac{b}{E} \oplus c \qquad (4.44) $$

where a, b, and c are constants and the different terms are added in
quadrature. The constant a represents the ‘stochastic term’, b represents
the contribution from electronic noise, and c is a constant term. In a
calorimeter using a scintillator, if at a given energy the mean number
of detected photons is N, there will be Poisson fluctuations giving a
contribution to the stochastic term

$$ \frac{\Delta E}{E} = \frac{\Delta N}{N} = \frac{\sqrt{N}}{N} = \frac{1}{\sqrt{N}} \quad\Rightarrow\quad \frac{\Delta E}{E} = \frac{a}{\sqrt{E}} $$


However, in a sandwich calorimeter, this effect is usually negligible
compared with the ‘sampling’ fluctuations, i.e. fluctuations in the fraction
of energy deposited in the active layers.⁵⁰ The constant b in eqn 4.44 represents
the contributions from electronic noise and should be negligible at high
energies. The constant term c represents the effects of residual
non-uniformities in response across the cell and over all cells, as well as
variations in time. With the aid of good calibration procedures, the
constant term can be reduced to less than 1%.⁵¹ Measured parameters from
test-beam studies of the ATLAS [20] and CMS [62] electromagnetic
calorimeters are given in Table 4.2. However, during LHC operation, there
are other factors that will degrade the resolution, such as radiation
damage, uncertainties in the calibration constants, and ‘pile-up’ backgrounds
(particles from extra collisions in the same bunch crossing). For the very
important Higgs decay, $H \to \gamma\gamma$, the precision of the angular
measurement also contributes to the mass resolution. These factors favour
the finer segmentation, the intrinsic stability, and the radiation
hardness of a liquid-argon calorimeter compared with a scintillator
calorimeter. The result is that the mass resolution for the Higgs decay
$H \to \gamma\gamma$ is comparable for ATLAS and CMS.
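To see how the three terms of eqn 4.44 compete, the sketch below evaluates $\Delta E/E = a/\sqrt{E} \oplus b/E \oplus c$ using the test-beam parameters of Table 4.2. The function and dictionary names are illustrative. The table gives no constant term for ATLAS, so the sketch sets it to zero purely as a placeholder assumption; this is not a measured value.

```python
import math


def em_resolution_percent(e_gev, a, b, c=0.0):
    """Fractional energy resolution in percent from eqn 4.44.

    a: stochastic term in % GeV^(1/2)
    b: noise term in GeV
    c: constant term in %
    The three contributions are added in quadrature.
    """
    stochastic = a / math.sqrt(e_gev)
    noise = 100.0 * b / e_gev
    return math.sqrt(stochastic ** 2 + noise ** 2 + c ** 2)


# Test-beam parameters from Table 4.2. The ATLAS constant term is not
# given there; c = 0 is a placeholder assumption for this sketch only.
TABLE_4_2 = {
    "ATLAS": {"a": 10.0, "b": 0.4, "c": 0.0},
    "CMS": {"a": 2.8, "b": 0.12, "c": 0.3},
}

if __name__ == "__main__":
    for energy in (10.0, 100.0, 500.0):
        for name, pars in TABLE_4_2.items():
            res = em_resolution_percent(energy, **pars)
            print(f"{name:5s} E = {energy:5.0f} GeV  dE/E = {res:.2f} %")
```

At low energy the noise and stochastic terms dominate, while at several hundred GeV the constant term sets the floor, which is why calibration and uniformity matter so much at the LHC.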


4.7.5 Hadronic calorimeter 


The hadronic calorimeter surrounds the electromagnetic calorimeter. 
Ideally, the combined electromagnetic and hadronic calorimeters should 
contain nearly all the energy from showers from hadrons entering the 
calorimeters (mostly π±). An indication of the required depth of the
calorimeter can be deduced from the curves in Fig. 4.8. The practical 
depths for hadronic calorimeters are constrained by cost and available 
space, but a rule of thumb is that at LHC energies a depth of at least 
about 10 nuclear interaction lengths is required. A homogeneous had- 
ronic calorimeter would be too large and so is not a practical option, 
and the hadronic calorimeter will be of the ‘sandwich’ type. The
resolution for hadronic calorimeters is greatly degraded if the calorimeter
is not ‘compensating’, i.e. if the ratio of the response to
electrons to that of hadrons, e/h, is significantly different from unity
(see Section 4.3.6). There are several possible approaches to achieving
compensation in hadronic calorimeters: 


• Tuning the ratio of absorber to active thickness: The energy
loss for electrons scales as $Z^2$, compared with Z for charged
hadrons; therefore, in thicker absorbers, the value of e/h can be
lowered (however, this has other problems that will be discussed
later). Lower-Z cladding can be used to absorb low-energy photons
preferentially, which also reduces e/h.


• Increasing the hadronic response: Instead of trying to suppress
the response to electrons, we can try to enhance the response
to hadrons. There are many low-energy neutrons that can be
indirectly detected by elastic scattering off nuclei. The optimal nucleus
is hydrogen, so detectors containing hydrogen, such as organic
scintillators, can be used.


• Use of depleted uranium: One suggestion to increase the hadronic
response was to have uranium absorber plates and use the
energy released by fission after fast-neutron capture.⁵²


• Software compensation: In a finely grained calorimeter, calibration
procedures can be optimized to try to achieve compensation;
this approach is discussed below.


• Dual readout: The idea is to read out the shower energy using
two different techniques with very different values for e/h. A prototype
of such a hadronic calorimeter has been built by the DREAM
collaboration and it uses copper tubes, each filled with scintillator
and quartz fibres. The signal from the scintillator (S) and the
Čerenkov (C) radiation in the quartz fibres are measured separately.
The values of e/h are very different for the S and C signals,
which enables determination of the electromagnetic fraction $f_{\mathrm{em}}$
for individual showers. The effect of e/h being different from unity
can therefore be corrected, effectively achieving the good hadronic
resolution of compensating calorimeters.

52 This was the motivation for the use of uranium in the ZEUS calorimeter,
which achieved compensation. However, the first two items were more
important than fission.

53 The ATLAS barrel calorimeter has e/h ≈ 1.4.

54 The correction factors also depend on the energy as well as the local energy
density.
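The dual-readout idea can be illustrated with a simple two-signal model. The sketch assumes that both signals are calibrated with electrons and respond as $S = E[f_{\mathrm{em}} + (h/e)_S(1 - f_{\mathrm{em}})]$ and $C = E[f_{\mathrm{em}} + (h/e)_C(1 - f_{\mathrm{em}})]$, which can be solved for E and $f_{\mathrm{em}}$ shower by shower. The function name and the numerical h/e values used in the example are hypothetical and chosen only for illustration; note that h/e is the inverse of the e/h ratio used in the text.

```python
def dual_readout(s_signal, c_signal, h_over_e_s, h_over_e_c):
    """Solve the two-signal model for the shower energy and em fraction.

    Model (signals calibrated at the electron scale):
        S = E * (f + h_over_e_s * (1 - f))
        C = E * (f + h_over_e_c * (1 - f))
    Returns (E, f).
    """
    if h_over_e_s == h_over_e_c:
        raise ValueError("the two readouts must have different h/e values")
    chi = (1.0 - h_over_e_s) / (1.0 - h_over_e_c)
    energy = (s_signal - chi * c_signal) / (1.0 - chi)
    f_em = (c_signal / energy - h_over_e_c) / (1.0 - h_over_e_c)
    return energy, f_em


if __name__ == "__main__":
    # Hypothetical h/e values for the scintillator and Cherenkov channels.
    hs, hc = 0.7, 0.2
    true_e, true_f = 100.0, 0.6
    s = true_e * (true_f + hs * (1.0 - true_f))
    c = true_e * (true_f + hc * (1.0 - true_f))
    print(dual_readout(s, c, hs, hc))
```

The larger the difference between the two h/e values, the smaller the weight $1/(1 - \chi)$ that multiplies the measured signals, so fluctuations in S and C are amplified less in the reconstructed energy.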


Although compensating calorimeters have been built, there are
disadvantages in cost and/or resolution for electrons and photons, and
the calorimeters for the LHC experiments are not compensating.⁵³ In a
highly segmented calorimeter such as that used by ATLAS, the hadronic
resolution can be improved by ‘software compensation’; the secondary
electromagnetic showers are smaller than hadronic showers, so they lead
to higher energy density in the calorimeter cells. Therefore, the electron
response can be decreased by de-weighting cells with large energy, thus
making the response closer to being compensating and thereby improving
the resolution. If the calorimeter cells are calibrated using electrons,
the naive estimate of the energy in a hadronic shower would be given
by $E = \sum_i E_i$, where $E_i$ is the energy in the ith calorimeter cell. As
electromagnetic showers are more compact, the cells with higher local
energy density will probably have arisen from electromagnetic showers.
A correction factor is applied for hadronic showers. The correction factor
decreases for showers with higher local density of energy deposition.⁵⁴
The calibration procedure used to determine the calibration factors aims
to reconstruct the true energy on average and to optimize the resolution.
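A toy version of such a density-dependent correction is sketched below. Everything in it is an assumption made for illustration: the exponential form of the weight, the parameter names `w_had` and `rho0`, and their default values are hypothetical and are not the calibration used by any experiment, where the weights are instead fitted to test-beam or simulated data.

```python
import math


def cell_weight(energy_density, w_had=1.4, rho0=1.0):
    """Hypothetical correction factor for a cell calibrated at the electron scale.

    Low-density (hadron-like) cells are scaled up towards `w_had`;
    high-density (electromagnetic-like) cells keep a weight close to 1,
    so the correction factor decreases with increasing energy density.
    """
    return 1.0 + (w_had - 1.0) * math.exp(-energy_density / rho0)


def naive_energy(cells):
    """Sum of cell energies E_i at the electron scale."""
    return sum(energy for energy, _volume in cells)


def corrected_energy(cells, w_had=1.4, rho0=1.0):
    """Weighted sum of E_i with a weight that depends on the local density E_i / V_i.

    `cells` is a list of (energy, volume) pairs in arbitrary consistent units.
    """
    return sum(
        energy * cell_weight(energy / volume, w_had, rho0)
        for energy, volume in cells
    )


if __name__ == "__main__":
    shower = [(5.0, 1.0), (0.5, 1.0), (0.2, 1.0), (0.1, 1.0)]
    print(naive_energy(shower), corrected_energy(shower))
```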

The resolution for hadronic calorimeters can be parameterized by the
same form as for electromagnetic calorimeters (eqn 4.44). The stochastic
term will be larger because of the relatively coarse sampling, and if
the calorimeter is non-compensating, then there will be a large constant
term, which will dominate the resolution at high energies. If the
calorimeter is not sufficiently deep, the energy lost at the back of the calorimeter
will also contribute to the constant term. Any crack regions between cells
or non-uniformity of the response over a cell will also add to the
constant term. Typical examples of hadronic resolution for compensating
and non-compensating calorimeters are given in Table 4.3. The superior
resolution of the compensating ZEUS calorimeter [51] compared with the
non-compensating ATLAS scintillating tile calorimeter [20] is clear, but
even so the resolution is far inferior to that achieved by electromagnetic
calorimeters. However, the compensation achieved in the ZEUS
calorimeter came at the price of degrading the electromagnetic resolution. So,
as is usual in detector physics, there is no perfect answer and designs
must be optimized to the requirements of a particular experiment.

                     ATLAS tile calorimeter    ZEUS
a (% GeV^{1/2})      52                        35
b (GeV)              1.6                       –
c (%)                3.0                       –

Table 4.3 Energy resolution for prototype hadronic calorimeters measured in
test beams.


4.8 Detectors for particle identification 


In this section, we review some detector techniques for particle
identification. Some particle identification is performed by combining signals
in different types of detectors,⁵⁵ but here we restrict ourselves to types
of detectors that give standalone particle identification.

55 For example, a high-momentum track that is matched to an
electromagnetic shower in a calorimeter can be identified as an electron.


4.8.1 Particle identification with Cerenkov 
detection 


The are two practical applications of Cerenkov radiation for particle 
identification: 


• Threshold counter: If we measure the momentum p of a charged 
particle, we can determine its speed depending on what particle 
type it is (and hence what mass it has). For some range of momentum, 
we can arrange that v > 1/n for one type of particle (e.g. π±), 
while another (e.g. K±) is below threshold. Therefore, we can 
separate π± from K± depending on whether a Cerenkov signal 
was detected. 

• Ring imaging Cerenkov (RICH): A RICH detector represents 
a more sophisticated use of Cerenkov radiation in which we measure 
the direction of Cerenkov photons. This requires optics to focus 
the photons of a given angle to a particular location on the photon 
detector. We then associate particular Cerenkov photons with 
particular charged particles and fit a ring (hence the name of the 
technique) and measure the Cerenkov angle. If we know the refractive 
index of the medium, we can then determine the speed of the 
charged particle. Knowing the momentum p from an independent 
detector then allows us to estimate the mass of the charged particle 
and hence identify it as a pion, kaon, etc. We will look at an 
example of a RICH detector in Chapter 10. 
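
The two techniques above can be sketched numerically. The following is a minimal illustration, not a detector simulation. It assumes natural units (c = 1), approximate π± and K± masses, and a hypothetical gas radiator with refractive index n = 1.0014 (an assumed value chosen for illustration, not taken from the text). It computes the threshold momentum at which β = 1/n, the Cerenkov angle from cos θ = 1/(nβ), and the mass recovered from a measured angle and momentum, as in a RICH.

```python
import math

# Approximate masses in GeV (natural units, c = 1).
M_PION = 0.1396
M_KAON = 0.4937


def beta(p, m):
    """Speed (in units of c) of a particle with momentum p and mass m."""
    return p / math.sqrt(p * p + m * m)


def threshold_momentum(m, n):
    """Momentum at which beta = 1/n, i.e. where Cerenkov emission starts."""
    return m / math.sqrt(n * n - 1.0)


def cerenkov_angle(p, m, n):
    """Cerenkov angle in radians from cos(theta) = 1/(n*beta); None below threshold."""
    cos_theta = 1.0 / (n * beta(p, m))
    return math.acos(cos_theta) if cos_theta <= 1.0 else None


def mass_from_ring(p, theta, n):
    """Invert a RICH measurement: mass from momentum and measured Cerenkov angle."""
    b = 1.0 / (n * math.cos(theta))  # speed implied by the ring
    return p * math.sqrt(1.0 / (b * b) - 1.0)


if __name__ == "__main__":
    n_gas = 1.0014  # hypothetical radiator index (assumption)
    print("pion threshold (GeV):", threshold_momentum(M_PION, n_gas))
    print("kaon threshold (GeV):", threshold_momentum(M_KAON, n_gas))
    # Between the two thresholds only the pion gives a signal.
    print("5 GeV pion angle:", cerenkov_angle(5.0, M_PION, n_gas))
    print("5 GeV kaon angle:", cerenkov_angle(5.0, M_KAON, n_gas))
```

With these assumed values, a threshold counter separates pions from kaons only in the momentum window between the two thresholds, whereas the ring measurement gives a mass estimate for any track above threshold.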


4.8.2 Particle identification with transition 
radiation 


We have seen that charged particles crossing a boundary between two 
dielectric layers can emit X-rays. As the transition radiation increases 
with the Lorentz γ factor, for practical purposes the yield is only 
significant for high-energy electrons, and this therefore provides a method 
to separate high-energy electrons from charged hadrons. As the photon 
yield per dielectric boundary is so low, we need many such boundaries. 
This sets a lower limit on the required length for a useful 
transition-radiation detector.56 The transition-radiation photons are in 
the X-ray region. These X-ray photons can be detected in wire chambers 
with a large fraction of a heavy noble gas like xenon. Xenon has Z = 54, 
which results in a large absorption cross section for X-rays, thus 
increasing the probability of X-ray absorption in a thin layer of gas. The 
energy deposited by X-rays is larger than the typical energy deposited by 
ionization in the gas, so a suitable discriminator level can be set that is 
sensitive to the X-rays from transition radiation but is rather insensitive 
to ionization.57 

56 This is not a problem for fixed-target experiments; however, it is very 
problematic for collider detectors, for which the radial space for the 
tracker is limited by the inner radius of the calorimeter. 

57 The problem is that there are large statistical fluctuations in the 
magnitude of the energy loss deposited by ionization in a short path 
length in a gas. 


4.8.3 Particle identification with ionization 


We saw that the rate of energy loss by ionization depends on the speed β 
of the particle (see eqn 4.5). Therefore, if we can make a suitably precise 
measurement of the energy loss by ionization and the momentum of a 
particle, we can achieve some separation between particles with different 
masses (e.g. pions and kaons). The momentum can be measured by a 
tracking detector in a magnetic spectrometer, and the amplitude of the 
signals in the elements of the tracking detector provides a measurement 
of the energy loss by ionization. The first difficulty with this technique 
is the presence of very large fluctuations in energy loss by ionization in 
thin layers, so if a wire chamber is used, a very large number of samples 
is required to achieve useful particle identification. The second problem 
is that the rate of energy loss as a function of momentum ‘plateaus’ at 
high momentum, so this technique is only useful at lower energies. 
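
The loss of separating power at high momentum can be illustrated with a short sketch. It assumes only the leading 1/β² dependence of the ionization loss and ignores the logarithmic term and the density effect, so it is not the full expression of eqn 4.5. The masses are approximate, and the function names are illustrative. At fixed momentum, 1/β² = 1 + (m/p)², so the ratio between kaons and pions tends to 1 as p grows.

```python
# Approximate particle masses in GeV (natural units, c = 1).
MASSES_GEV = {"pion": 0.1396, "kaon": 0.4937, "proton": 0.9383}


def inv_beta_squared(p_gev, mass_gev):
    """Return 1/beta^2 = 1 + (m/p)^2, the leading velocity factor in dE/dx."""
    return 1.0 + (mass_gev / p_gev) ** 2


def kaon_to_pion_ratio(p_gev):
    """Ratio of the leading 1/beta^2 factors for a kaon and a pion at equal momentum."""
    return (inv_beta_squared(p_gev, MASSES_GEV["kaon"])
            / inv_beta_squared(p_gev, MASSES_GEV["pion"]))


if __name__ == "__main__":
    for p in (0.3, 0.5, 1.0, 3.0, 10.0):
        print(f"p = {p:5.1f} GeV  K/pi ratio of 1/beta^2 = {kaon_to_pion_ratio(p):.3f}")
```

In this simplified picture the ratio is about 3 at 0.3 GeV but differs from 1 by well under 1% at 10 GeV, which is far smaller than the fluctuations of individual energy-loss samples.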


4.9 Magnetic fields 


We need magnetic fields for trackers and muon spectrometers in order 
to use the measured trajectory to reconstruct the momenta. The magnets 
are usually based on the same NbTi superconducting technology 
discussed for accelerators in Section 3.3. The volumes of the magnets 
are very much larger and, although the magnetic fields are smaller, the 
energy stored in these fields is very much greater, which leads to new 
engineering challenges. 


4.9.1 Magnetic fields for trackers 


The usual choice of field configuration for trackers at colliders is a 
solenoid (with the axis along the beam line). To minimize the volume 
and cost, one option is to place the solenoid between the tracker 
and the calorimeter. Clearly, too much ‘passive’ material upstream 
of the calorimeter will degrade the resolution of the electromagnetic 
calorimeter. Therefore, the fields are generated using superconducting 
magnets, with field strengths up to 2 T being typical. The CMS magnet 
has a field strength of 4 T and has a larger radius, so the entire 
calorimeter system is housed inside the solenoid. 


4.9.2 Magnetic fields for muon spectrometers 


One option for the magnetic field for the muon spectrometer is to use 
magnetized iron. If there is a superconducting solenoid for the tracker, 
the magnetic flux will return from the solenoid through the iron sur- 
rounding the solenoid. In this case, the iron serves multiple purposes: it 
can be the passive absorber for the hadron calorimeter and act as shield- 
ing to remove particles other than muons before they reach the muon 
chambers, as well as acting as the return ‘yoke’ for the solenoid. The iron 
is instrumented with tracking chambers (a variety of wire chambers) and 
the reconstructed muon tracks in these chambers can therefore be used 
to determine the momenta. The momentum resolution for these tracks is 
limited by multiple scattering to about 10%. In the CMS approach, the 
muon spectrometer tracks are linked to the much more precisely meas- 
ured tracks in the tracker and hence a good muon momentum resolution 
can be achieved.58 

In the approach used for ATLAS, the magnetic field for the muon 
spectrometer is generated by eight large superconducting toroids in the 
central (‘barrel’) region and eight smaller superconducting coils in each 
end cap (see Fig. 4.28). The average magnetic field in the tracking volume 
is in the range ~0.5–1 T, but very good resolution is achieved by tracking 
over a long length l ~ 5 m.59 Since most of the volume is air, the 
momentum resolution is not so limited by multiple scattering as for 
magnetized iron. Another advantage of this field configuration is that 
it allows reconstruction of precise muon momenta independently of the 
tracker.60 

Fig. 4.28 Schematic view of the ATLAS toroid coils [20]. The eight barrel 
toroid coils with the interleaved end-cap coils are shown. The cylinder 
shows the return flux for the solenoid. The length is 25.1 m and the outer 
diameter is 20.1 m. 

58 See Exercise 4.9 for a discussion of some issues associated with muon 
triggers for this configuration. 

59 The total energy stored in the ATLAS magnetic fields is about 1.6 GJ, 
which is the same magnitude as the kinetic energy in a TGV train with a 
mass of 385 t travelling at 330 km h⁻¹. 

60 The most precise muon measurement is then obtained by combining the 
estimates from the tracker and the muon spectrometer. 
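
Two of the numbers quoted here are easy to check. The sketch below evaluates the kinetic energy of the train in the stored-energy comparison, and estimates the sagitta of a muon track in the toroid field. It uses the relation p_T = 0.3BR (p_T in GeV, B in tesla, R in metres) that also appears in the exercises, together with the small-angle approximation s ≈ l²/(8R). The 1 TeV muon momentum is a hypothetical example value, and the function names are illustrative.

```python
def kinetic_energy_joules(mass_kg, speed_m_per_s):
    """Non-relativistic kinetic energy, (1/2) m v^2."""
    return 0.5 * mass_kg * speed_m_per_s ** 2


def sagitta_m(p_gev, b_tesla, length_m):
    """Sagitta of a track of momentum p over a chord of given length in field B.

    Uses R = p / (0.3 B) and the approximation s = l^2 / (8 R), valid for l << R.
    """
    radius_m = p_gev / (0.3 * b_tesla)
    return length_m ** 2 / (8.0 * radius_m)


if __name__ == "__main__":
    # 385 t train at 330 km/h, as in the comparison with the stored magnetic energy.
    tgv_energy = kinetic_energy_joules(385e3, 330.0 / 3.6)
    print(f"train kinetic energy: {tgv_energy / 1e9:.2f} GJ")

    # Hypothetical 1 TeV muon measured over l = 5 m in a 0.5 T field.
    s = sagitta_m(1000.0, 0.5, 5.0)
    print(f"sagitta: {s * 1e3:.2f} mm")
```

The train energy comes out at roughly 1.6 GJ, and the sagitta for the assumed 1 TeV muon is of order half a millimetre, which indicates the scale of alignment and position resolution needed in the muon chambers.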


4.10 Trigger 


The trigger is an electronic and software system operating in ‘real’ time 
to reduce the raw data rate to a level that can be permanently stored. 
The trigger should keep as much of the interesting physics while rejecting 
the maximal amount of background events. The aim is to bring the rate 
down from the raw interaction rate to the maximum at which data 
can be kept in permanent storage, while retaining as large a fraction of 
the signal events as possible. Traditionally, this rate was typically of the 
order of a Hz, but advances in computer technology now allow far higher 
rates. The event rates are very different for different colliders. At e+e− 
colliders, the rates are relatively low, of the order of a Hz, but the rates 
at hadron colliders have been increasing. At the LHC, there are multiple 
interactions per bunch crossing (with a bunch spacing of 50 ns in 2012 
running and 25 ns for the nominal LHC operation) and the trigger reduces 
this rate to a level of the order of 500 Hz. 
Typically, there are three trigger stages or levels: 


• In the older generation of experiments, at the first, fastest, level, 
the selection is based on the timing and the signal level of de- 
tector components. The implementation is usually in fast hardware 
logic operations on outputs from units like comparators and coin- 
cidence counters. Detector signals are required to be in coincidence 
with colliding-beam bunches and to be compatible with tracks 
and energy deposits of particles coming from a small region where 
colliding-beam bunches overlap. More sophisticated algorithms are 
required for the LHC (see Section 4.10.1). 


• At the second level, fast processors are used to reject background 
events, like those coming from cosmic rays or stray accelerator 
particles in a halo around the beam pipe or from beam particles 
interacting with molecules and atoms in the residual gas in the 
beam pipe. At this level, we also need to reject genuine, but 
uninteresting, physics events produced by colliding-beam particles. 


• The third level is often comparable to the offline reconstruction. 
Farms of computers select signal events to be stored for off- 
line reconstruction and analysis. The main difference between the 
third-level trigger programs and offline programs is in the use of 
calibration constants and correction procedures, which need to be 
obtained or developed separately offline. 


4.10.1 LHC triggers 


The issue of efficiently triggering on interesting physics events, while 
maintaining a manageable readout rate, is one of the main challenges 
for LHC detectors. At design luminosity, the rate of pp collisions is about 
1 GHz, and this rate has to be reduced to the order of 500 Hz for data 
to be stored for subsequent offline analysis. The first-level trigger (L1) 
uses signals from the full detector, which, given the finite speed of light, 
makes it impossible to generate a trigger decision from one bunch cross- 
ing before the following bunch crossing occurs (25ns at nominal LHC 
operation). This apparently insoluble problem is solved with the aid 
of a ‘pipelined’ system.61 The data are stored on the detector in 
‘pipeline’ memory (see Fig. 4.20), while the L1 decision is being made by 
a custom hardware processor. In such a pipelined processor, one step of 
the trigger process operates on the data for a particular event in one 
clock cycle and then the next step is operated in the following clock 
cycle. The number of allowed steps for such a processor depends on the 
depth of the pipeline memory in which the data are stored.62 As all 
bunch crossings have genuine pp collisions, it is no longer sufficient 
to simply reject non-beam backgrounds; the L1 trigger must decide 
which real events to keep. The L1 trigger identifies interesting 
signatures, such as high-transverse-momentum electrons, by performing 
hardware sums of the energy deposited in neighbouring cells in the 
electromagnetic calorimeter. A global L1 trigger decision is made on the 
basis of several signatures (high-transverse-momentum muon candidates, 
large missing transverse energy, etc.). This L1 trigger typically reduces 
the rate to the order of 100 kHz. At this rate, it is now feasible to read 
out all the data corresponding to triggered bunch crossings63 and for the 
data to be processed by very large computer farms, which use the full 
detector granularity to reduce the rate to the required order of 500 Hz 
for storage. 
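
The rate budget described above can be written out as simple arithmetic. The sketch below uses only the order-of-magnitude rates quoted in the text, together with the pipeline depth of 132 and the 3.2 μs latency quoted in the accompanying margin note. The variable names are illustrative.

```python
# Order-of-magnitude rates quoted in the text.
COLLISION_RATE_HZ = 1e9      # pp collisions at design luminosity
L1_OUTPUT_RATE_HZ = 100e3    # after the first-level trigger
STORAGE_RATE_HZ = 500.0      # written to permanent storage

# Timing figures, in integer nanoseconds to avoid rounding issues.
BUNCH_SPACING_NS = 25        # nominal LHC operation
L1_LATENCY_NS = 3200         # 3.2 microseconds, from the margin note
PIPELINE_DEPTH = 132         # typical depth, from the margin note

# Rejection factor needed at each stage, and overall.
l1_rejection = COLLISION_RATE_HZ / L1_OUTPUT_RATE_HZ
farm_rejection = L1_OUTPUT_RATE_HZ / STORAGE_RATE_HZ
total_rejection = COLLISION_RATE_HZ / STORAGE_RATE_HZ

# Number of bunch crossings that must be buffered while L1 decides.
crossings_during_latency = L1_LATENCY_NS // BUNCH_SPACING_NS
fits_in_pipeline = crossings_during_latency <= PIPELINE_DEPTH

if __name__ == "__main__":
    print(f"L1 rejection factor:       {l1_rejection:.0f}")
    print(f"farm rejection factor:     {farm_rejection:.0f}")
    print(f"overall rejection factor:  {total_rejection:.0f}")
    print(f"crossings buffered during L1 latency: {crossings_during_latency}")
    print(f"fits in a pipeline of depth {PIPELINE_DEPTH}: {fits_in_pipeline}")
```

Most of the rejection therefore has to be delivered by the hardware L1 stage, while the latency sets how many bunch crossings the on-detector pipeline must hold.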


4.11 Examples of detector systems 


Now that we have seen the principles behind the design of detector 
subsystems, we can start to understand how these principles are applied in 
the designs of real detectors. We first look at collider detectors and then 
briefly consider neutrino detectors. Dark matter detectors are described 
in Section 13.7.2. 


4.11.1 Collider detectors 


We will take the ATLAS and CMS detectors as examples of collider 
detectors.64 The ATLAS detector is illustrated schematically in Fig. 4.29. 
The tracker is immersed in a 2 T magnetic field and consists of silicon 
detectors closest to the beam line and a Transition Radiation Tracker 
(TRT) at larger radius. The silicon detector contains three layers of 
pixels closest to the beam pipe to provide the best resolution for the 
impact parameter and layers of silicon strips at larger radius. The TRT is 
made from cylindrical ‘straw’ tubes, with each tube working as an 
independent cylindrical drift chamber. The tubes are interleaved with Mylar 
foils to generate transition radiation to enhance electron identification. 
The electromagnetic calorimeter is based on the liquid-argon accordion 
calorimeter (see Section 4.7.3). In the central region, the hadronic 
calorimeter uses an iron–scintillator sandwich design. The light from the 
scintillators is coupled to the photomultipliers using wavelength-shifting 
fibres. The novel feature of this design is that the steel absorber plates are 
rotated by 90° compared with the conventional design in which the plates 
are perpendicular to the direction of incidence of primary particles. This 
has the advantage that the calorimeter cells are self-supporting, thus 
avoiding ‘dead’ material between cells. Although the calorimeter system 
is not compensating, the fine granularity allows the use of software 
compensation to improve the resolution. Calorimeters extend up to 
pseudorapidity η ~ 5 in order to reconstruct missing transverse momentum 
(see Chapter 8). The muon spectrometer uses the toroidal coils discussed 
in Section 4.9.2. In the central barrel region, the muon tracks are 
measured using detectors based on drift tubes. However, the signals are 
too slow to participate in the first-level trigger (see Section 4.10) and 
therefore faster but lower-resolution detectors are also used. 

Fig. 4.29 Schematic view of the ATLAS detector [20]. 

61 This approach was pioneered by the H1 and ZEUS experiments at DESY for 
the HERA collider. 

62 A typical pipeline depth of 132 corresponds to a time of 3.2 μs, which is 
sufficient to allow the signals to reach the trigger processor, for a 
trigger decision to be made, and for that decision to be fed back to the 
electronics on the detector. 

63 The readout is performed using optical fibre links. 

64 We cover some unique aspects of the LHCb detector in Section 10.7.3. 

A schematic view of the CMS detector is shown in Fig. 4.30. There 
is a very large all-silicon tracker consisting of three layers of pixel 
detector and 10 layers of strip detectors immersed in the 4 T solenoidal 
magnetic field, which provides very good momentum resolution for 
charged particles. The electromagnetic calorimeter uses PbWO4 crystals 
(see Section 4.7.2). The hadronic calorimeter uses a brass–scintillator 
sandwich calorimeter. As with ATLAS, forward calorimeters extend the 
coverage to close to the beam pipe. The muon chambers are interleaved 
with the return yoke of the solenoid. They are used for the first-level 
muon trigger, but high-precision measurements of muon momentum are 
made in the tracker. 

[Fig. 4.30 component labels: superconducting solenoid, silicon tracker, 
pixel detector, electromagnetic calorimeter.] 


4.11.2 Neutrino detectors 


Optimization of neutrino detectors is very different to that of collider 
detectors because the very small cross sections mean that very massive 
detectors are needed to allow useful event rates to be obtained.65 Given 
the sizes involved, we are obliged to use cheaper detector technologies 
than at hadron colliders. The requirements depend on the neutrino en- 
ergies. For an accelerator neutrino experiment, a typical requirement is 
to have a very large target mass and be able to measure the following: 


• Electrons: We need to measure electrons from neutral-current 
scattering (see Chapter 7) off electrons, $\nu_x e^- \to \nu_x e^-$ 
(where $\nu_x$ is any flavour of neutrino), or from similar processes 
with scattering on the nuclei. 


• Muons: We have muons from charged-current interactions 
$\nu_\mu N \to \mu^- N' X$, where $N$ and $N'$ are the target and 
scattered nuclei, respectively, and $X$ represents any hadrons produced 
in the interaction. 


• Hadrons: For neutral-current scattering off nuclei, the only particles 
we can measure are the outgoing hadrons. Measurements 
of produced hadrons also improve the determination of the event 
kinematics for charged-current interactions. 


In general, we can use calorimeters to measure electrons and hadrons. 
If the passive absorber plates are made from magnetized iron and we 
instrument the gaps between absorbers with some tracking detector, we 
can determine the tracks caused by muons. We can then identify muons 
as particles that penetrate deeper into the detector than hadrons and 
at the same time we can estimate the momentum by measuring the 
curvature of the tracks. We will see how these principles are applied in 
practice in the MINOS far detector in Chapter 11. 


Fig. 4.30 Schematic view of the CMS 
detector [62]. 


65 This is particularly true for neutrino 
detectors in laboratory oscillation ex- 
periments, in which we need a detector 
far from the neutrino source to study 
oscillations (see Chapter 11). 


Chapter summary 


• The physics of interactions of high-energy particles in matter has been 
reviewed. 

• The basic detector physics of how a signal is generated by charged 
particles has been explained. 

• A brief summary of how different types of scintillators work has been 
given. 

• The basic concepts of trigger systems have been explained. 

• Different detector systems have been discussed, and it has been described 
how they are combined into a general purpose detector. 

• Further case studies of real particle physics detectors are given in other 
chapters. 


Further reading 


Particle Data Group (2014). Review of Particle Phys- 
ics. Chin. Phys. C, 38, 090001. In the section ‘Ex- 
perimental methods and colliders’, the review article 
‘Passage of particles through matter’ gives a thor- 
ough discussion. The review article ‘Particle detectors 
at accelerators’ gives a more advanced and thorough 
discussion than is given in the present chapter. 


Green D. (Ed.) (2010). At the Leading Edge: The AT- 
LAS and CMS LHC Experiments. World Scientific. 
This is a collection of advanced review articles on 
different aspects of the detectors. 


Grupen, C. and Shwartz, B. (2008). Particle Detectors 
(2nd edn). Cambridge University Press. This gives 
a very comprehensive description of many detector 
technologies. 


Kleinknecht, K. (1998). Detectors for Particle Radi- 
ation (2nd edn). Cambridge University Press. This 
gives a short and clear introduction to detector physics. 


Blum, W., Riegler, W., and Rolandi, L. (2008). Particle 
Detection with Drift Chambers (2nd edn). Springer. 
This is the definitive advanced textbook on this sub- 
ject. 


Spieler, H. (2005). Semiconductor Detector Systems. 
Oxford University Press. This is a very good ad- 
vanced textbook on silicon detectors and the associated 
electronics. 


Exercises 


(4.1) Starting from eqn 9.12 in Chapter 9 and using a 
change of variable, derive eqn 4.1. 


(4.2) Consider elastic scattering of a heavy particle of 
mass $M$ with speed $\beta$ on a stationary electron of mass $m_e$. 


(a) Let the kinetic energy of the scattered electron 
be $T$ in the frame in which the electron was 
initially at rest. Show that the 4-momentum 
transfer evaluated in this frame is $Q^2 = 2 m_e T$. 


(b) Assuming $m_e/M \ll 1$ and $\gamma m_e/M \ll 1$, show 
that the maximum kinetic energy of the electron 
after the scattering in the lab frame is 
$T_{\max} \approx 2\gamma^2\beta^2 m_e$. 


Hint: Consider the problem in the centre-of-mass system (CMS) 
and then use a Lorentz transformation from the CMS to 
the lab. 


(4.3) Calculate the Fourier coefficients for the periodic 
function defined by f(z) = z/L for 0 < z < L and 
f(z) = 0 for -L < z < 0 (the function repeats 
periodically). Use this result to derive eqn 4.17 
starting from eqn 4.16. 


(4.4) (a) A very crude model of the initial development 
of an electromagnetic shower is that a 
high-energy electron or positron of energy $E_0$ 
undergoes a bremsstrahlung process after a 
distance L (one radiation length) and loses half 
of its energy to a secondary photon, or that a 
high-energy photon initiates a pair production 
process after travelling a distance L, splitting 
its energy equally between the two secondary 
particles. These processes continue until the 
photons and charged particles each have an energy 
less than the critical energy $E_c$ ($\ll E_0$), at 
which point the multiplication ceases. Develop 
this model and answer the following questions 
both for an incident electron of energy $E_0$ and 
for an incident photon of the same energy: 

(i) How many photons and charged particles 
will there be after N radiation lengths? 

(ii) What is the energy of each particle in the 
shower after N radiation lengths? 

(iii) What is the depth (in units of L) at which 
the number of particles in the shower is 
a maximum, and what is the number of 
particles at maximum? 

(b) Compute the depth and the number of particles 
when multiplication ceases for a 4 GeV 
electron entering lead glass (L = 2.5 cm, 
$E_c$ = 10 MeV). 

(c) Explaining any assumptions you make, how 
would the resolution of an electromagnetic 
calorimeter scale with the energy of the incident 
electron, E? 


(4.5) Calculate the direction of Cerenkov radiation with 
respect to the direction of motion of fast charged 
particles in water. The refractive index of water 
is 1.33. 


(4.6) Calculate the threshold energy above which 
electrons and muons emit Cerenkov radiation in water 
(refractive index = 1.33). What consequences does 
this have for the measurement of 

(a) the solar neutrino flux? 

(b) the flavour ratio of atmospheric neutrinos? 


(4.7) A very simple model of a high-precision silicon 
‘micro-vertex detector’ (MVD) consists of two 
concentric cylindrical layers surrounding the beam 
line. The first layer is at radius $R_0$ = 5 cm, and 
the separation between the first and second layers 
is L = 2 cm. The intrinsic measurement resolution 
of a hit is σ = 10 μm in the $R\phi$ direction (roughly 
orthogonal to the trajectory of a particle with large 
transverse momentum). 

Show that (neglecting multiple scattering) the 
uncertainty in the impact parameter (distance of 
closest approach to the beam line in the plane 
perpendicular to the beam), $\sigma_d$, is given by 

\[
\sigma_d = \sigma\,\frac{\sqrt{(R_0 + L)^2 + R_0^2}}{L}
\]

and calculate it for the parameter values given 
above. How does $\sigma_d$ change if (i) L is doubled; 
(ii) $R_0$ is increased to 8 cm? What factors limit 
the ability to decrease $R_0$ or increase L? Assume 
that each layer has a thickness of 2% of a radiation 
length. How does multiple scattering affect the 
impact parameter resolution? For what momentum 
would the uncertainty in the impact parameter 
from measurement error be equal to that from 
multiple scattering? 


(4.8) Consider a cylindrical detector immersed in a 
uniform solenoidal magnetic field B. Let R be the 
radius of curvature of a track in the plane 
transverse to the beam line (measured in metres). 
Show that the transverse momentum $p_T = 0.3BR$ 
(with $p_T$ in GeV and B in tesla). 
A very simplified model for the resolution of a 
tracker assumes that the track is precisely located 
at the start and end of the trajectory but there is 
a measurement error in the transverse plane of $\sigma_s$ 
at a radius of half the outer radius of the tracker 
(L/2). Using this model, determine the transverse 
momentum resolution as a function of $p_T$. For such 
a detector with B = 4 T, L = 1 m, and $\sigma_s$ = 10 μm, 
estimate the largest value of $p_T$ that could be 
measured with an error less than one-third of the 
value. 


(4.9) Consider a solenoid providing a uniform magnetic 
field $\mathbf{B} = B_1\hat{\mathbf{z}}$ for a radius $0 < R < R_1$. 
All the flux returns through a return yoke such that 
$\mathbf{B} = B_2\hat{\mathbf{z}}$ for a radius $R_1 < R < R_2$. 


(a) Show that $\int_0^{R_2} r\,B(r)\,dr = 0$. 

(b) What is the force on a charged particle moving 
with a velocity v in the (x, y) plane? Hence 
find the torque on the charged particle. 

(c) Now consider the trajectory of a muon created 
on the axis of the solenoid (r = 0). Combining 
the results of (a) and (b), show that there is no 
net change in angular momentum of the muon 
during its trajectory from r = 0 to $r = R_2$. 
Explain why this means that the trajectory of 
the muon after it exits the return yoke (i.e. at 
$r = R_2$) points back to the axis. 

(d) These calculations have ignored multiple 
scattering; how would this change the result 
qualitatively? 

(e) Discuss the implications for the measurement 
of muon momenta in this geometry. 


(4.10) We wish to measure a charged particle with 
momentum transverse to the beam line of 
$p_T$ = 500 GeV in a tracking detector immersed in 
a solenoidal field B = 2 T. If we require a momentum 
resolution $\sigma(1/p_T)/(1/p_T) = 0.3$, estimate 
the spatial resolution required for the sagitta 
measurement. Discuss which detector technology would 
be appropriate. 


(4.11) Consider a cylindrical drift chamber with a radius 
of 4 mm, operated at a voltage of 2 kV. If the 
positive ions have a mobility μ = 1 cm² V⁻¹ s⁻¹, 
calculate the maximum drift time. How long does 
it take to accumulate 50% of the full signal? 


(4.12) Consider an MWPC with anode spacing d. Consider 
the coordinate x in the plane of the anode 
wires. Calculate the root mean square difference 
in x between the location of a track and the nearest 
wire, and hence justify the claim that the 
resolution is $d/\sqrt{12}$. 


(4.13) Consider an n-doped semiconductor with carrier 
densities n and p of electrons and holes, respectively. 
Assuming that $n \gg p$, show that the 
electrical conductivity $\sigma = n e \mu_e$, where $\mu_e$ is the 
electron mobility. A pn silicon microstrip detector 
(see Fig. 4.19), with the resistivity of the silicon 
ρ = 10 kΩ cm, has a thickness w = 300 μm. 
The relative permittivity of Si is ε = 11.6 and 
$\mu_e$ = 1350 cm² V⁻¹ s⁻¹. Determine the bias voltage 
required to fully deplete the detector, $V_{\text{depletion}}$. 

If an electron-hole pair is created at a distance x 
from the p-type electrode, calculate the drift time 
of the hole in terms of the mobility of the holes, $\mu_h$. 
For silicon, with $\mu_h$ = 480 cm² V⁻¹ s⁻¹, determine 
the charge collection times for holes created at 
depths x = 0.5w and x = 0.9w. If the detector were 
operated at a bias voltage $V = 2V_{\text{depletion}}$, how 
would the charge collection times change? Hence 
discuss the advantages of operating the detector at 
a voltage greater than the depletion voltage. What 
limits the detector voltage that can be applied in 
practice? 


(4.14) The leakage current in a silicon detector is a 
source of noise. If the leakage current in one channel 
is $I_{\text{leak}}$ and the signal is integrated over a 
time T, make a simple estimate of $\sigma(Q_{\text{leak}})$, the 
contribution of the leakage current to the noise 
on the charge signal. For a typical LHC silicon 
detector, T ~ 25 ns (the bunch spacing). Estimate 
$\sigma(Q_{\text{leak}})$ for two cases: (a) $I_{\text{leak}}$ = 1 nA 
(typical for an un-irradiated strip detector) and 
(b) $I_{\text{leak}}$ = 1 μA (typical for a heavily irradiated 
strip detector). Compare these noise values with 
the signal expected from a 300 μm-thick silicon 
detector. Design a simple filter circuit to minimize 
the leakage current noise while keeping as much 
as possible of the signal. Suggest an approximate 
value for the cut-off frequency of your filter for an 
LHC microstrip detector. 


(4.15) Consider a silicon microstrip detector with 
p-doped implants (strips) in n-doped bulk silicon 
(see Fig. 4.19). Make a rough sketch of the ‘weighting’ 
field (see Section 4.4.1) in the region around 
one strip and indicate on it the region in which the 
weighting field will be large. A charged particle 
crosses such a detector in a direction perpendicular 
to the plane of the silicon and creates 
electron-hole pairs uniformly along its trajectory. 
For a reverse-biased detector, which way will the 
electrons (holes) drift? By combining the above 
considerations, show that the resulting signal will 
be dominated by the motion of holes, rather than 
electrons. 


Static quark model 


The static quark model of hadrons is central to the understanding of the 
pattern of hadronic masses and quantum numbers. Originally devised 
with three flavours of quarks (u, d, s), it was extended to include the 
heavy quarks: first charm after the ‘1974 revolution’, when the J/ψ 
was discovered and shown to be a $q\bar{q}$ state, and then beauty some 
years later. The top quark was long anticipated on the basis of 
quark–lepton ‘generation’ symmetry after the tau lepton was discovered 
in 1975,1 but proved to be enormously heavy when it was finally teased 
out of the data by the CDF and D0 experiments at the Tevatron. The pattern of 
In this chapter, we are concerned primarily with how the quark model 
helps our understanding of the phenomenology of mesons and baryons. 

The chapter starts with a reminder of the 2-component spin-½ algebra 
and its connection with the SU(2) group. Then, after a brief account of 
hadronic isospin (based on SU(2)), we explain how this approximate 
symmetry is extended, by including strangeness, to the flavour SU(3) of 
the static quark model.2 


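1 The discovery of the associated neutrino $\nu_\tau$ is discussed in 
Chapter 8. 

2 The definition and properties of the SU(n) groups are given in Chapter 2. 


5.1 Spin ½ 


For a spin-½ fermion, the eigenfunctions for spin up (m = +½) 
and down (m = −½) are 

\[
\begin{pmatrix} 1 \\ 0 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]

The raising and lowering operators, defined as (see Chapter 2) 

\[
s_\pm = s_x \pm i s_y,
\]

are 

\[
s_+ = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix},
\qquad
s_- = \begin{pmatrix} 0 & 0 \\ 1 & 0 \end{pmatrix}.
\]

This is easily demonstrated; for example, 

\[
s_+ \begin{pmatrix} 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 0 \\ 1 \end{pmatrix}
= \begin{pmatrix} 1 \\ 0 \end{pmatrix},
\qquad
s_+ \begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}\begin{pmatrix} 1 \\ 0 \end{pmatrix}
= \begin{pmatrix} 0 \\ 0 \end{pmatrix}.
\]

Working backwards, we get expressions for $s_x$ and $s_y$: 

\[
s_x = \tfrac{1}{2}\,(s_+ + s_-),
\qquad
s_y = \tfrac{1}{2i}\,(s_+ - s_-),
\]

and $s_z$ is derived from the requirement that 

\[
s_z \begin{pmatrix} 1 \\ 0 \end{pmatrix} = +\tfrac{1}{2}\begin{pmatrix} 1 \\ 0 \end{pmatrix},
\qquad
s_z \begin{pmatrix} 0 \\ 1 \end{pmatrix} = -\tfrac{1}{2}\begin{pmatrix} 0 \\ 1 \end{pmatrix}.
\]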


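We can now write the spin-½ defining expressions in terms of the Pauli 
matrices (note the factor ½): 

\[
\begin{aligned}
s_x &= \begin{pmatrix} 0 & \tfrac{1}{2} \\ \tfrac{1}{2} & 0 \end{pmatrix} = \tfrac{1}{2}\sigma_x, \\
s_y &= \begin{pmatrix} 0 & -\tfrac{i}{2} \\ \tfrac{i}{2} & 0 \end{pmatrix} = \tfrac{1}{2}\sigma_y, \\
s_z &= \begin{pmatrix} \tfrac{1}{2} & 0 \\ 0 & -\tfrac{1}{2} \end{pmatrix} = \tfrac{1}{2}\sigma_z.
\end{aligned}
\]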


The spin operator algebra can be summarized as 

\[
\left[\tfrac{1}{2}\sigma_i,\ \tfrac{1}{2}\sigma_j\right] = i\,\epsilon_{ijk}\,\tfrac{1}{2}\sigma_k .
\]

The link with SU(2) is clear: 


• The Pauli spin matrices are Hermitian and traceless. 

• There are three of them, as expected. 

• The set of unitary matrices $U_i(\theta_i) = e^{-\frac{i}{2}\sigma_i\theta_i}$ 
form the fundamental (irreducible)3 representation of SU(2). 

• The pattern of combinations is predicted from representation 
theory. 

3 See Chapter 2 for more on groups and their representations. 

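5.1.1 Combining two spin-½ particles 


Two spin-½ particles can combine to form four possible spin states. The 
total spin can be either $s = 1$, $s_z \in \{-1, 0, 1\}$ or $s = 0$, $s_z = 0$. We have 

\[
\begin{aligned}
|1,1\rangle &= \left|\tfrac{1}{2},\tfrac{1}{2}\right\rangle \left|\tfrac{1}{2},\tfrac{1}{2}\right\rangle = |{\uparrow\uparrow}\rangle, \\
|1,0\rangle &= \tfrac{1}{\sqrt{2}}\left( \left|\tfrac{1}{2},\tfrac{1}{2}\right\rangle \left|\tfrac{1}{2},-\tfrac{1}{2}\right\rangle + \left|\tfrac{1}{2},-\tfrac{1}{2}\right\rangle \left|\tfrac{1}{2},\tfrac{1}{2}\right\rangle \right) = \tfrac{1}{\sqrt{2}}\left( |{\uparrow\downarrow}\rangle + |{\downarrow\uparrow}\rangle \right), \\
|1,-1\rangle &= \left|\tfrac{1}{2},-\tfrac{1}{2}\right\rangle \left|\tfrac{1}{2},-\tfrac{1}{2}\right\rangle = |{\downarrow\downarrow}\rangle, \\
|0,0\rangle &= \tfrac{1}{\sqrt{2}}\left( \left|\tfrac{1}{2},\tfrac{1}{2}\right\rangle \left|\tfrac{1}{2},-\tfrac{1}{2}\right\rangle - \left|\tfrac{1}{2},-\tfrac{1}{2}\right\rangle \left|\tfrac{1}{2},\tfrac{1}{2}\right\rangle \right) = \tfrac{1}{\sqrt{2}}\left( |{\uparrow\downarrow}\rangle - |{\downarrow\uparrow}\rangle \right).
\end{aligned}
\]

The $s = 1$ triplet is symmetric under the interchange of particles. These 
states are deduced from the spin-raising/lowering operators and the 
Clebsch–Gordan coefficients.4 

The $s = 0$, $s_z = 0$ singlet state is found by requiring it to be orthogonal 
to the $s = 1$, $s_z = 0$ state. Note that the $s = 1$ states are symmetric 
under the interchange of particles 1 and 2, whereas the $s = 0$ state 
is antisymmetric. This illustrates what is meant by the representation 
notation 

\[
2 \otimes 2 = 3_{\text{sym}} \oplus 1_{\text{asym}},
\]

where ‘sym’ and ‘asym’ stand respectively for symmetric and antisymmetric 
combinations of the spin states of the two particles. 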
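
The triplet and singlet assignments can be verified with a small sketch. States are represented as dictionaries mapping product-basis labels to real amplitudes, and the total spin is obtained from S² = S₋S₊ + S_z² + S_z, using the fact that for spin ½ the single-particle raising and lowering operators flip a spin with unit coefficient. The representation and the function names are illustrative choices, not notation from the text.

```python
import math

UP, DOWN = 1, -1  # twice the s_z eigenvalue of a single spin


def ladder(state, frm, to):
    """Total raising (frm=DOWN, to=UP) or lowering (frm=UP, to=DOWN) operator."""
    out = {}
    for basis, amp in state.items():
        for i, s in enumerate(basis):
            if s == frm:
                flipped = basis[:i] + (to,) + basis[i + 1:]
                out[flipped] = out.get(flipped, 0.0) + amp
    return out


def s_z(state):
    """Total S_z acting on a state."""
    return {basis: amp * sum(basis) / 2 for basis, amp in state.items()}


def s_squared(state):
    """S^2 = S_- S_+ + S_z^2 + S_z acting on a state."""
    parts = [ladder(ladder(state, DOWN, UP), UP, DOWN), s_z(s_z(state)), s_z(state)]
    out = {}
    for part in parts:
        for basis, amp in part.items():
            out[basis] = out.get(basis, 0.0) + amp
    return out


def inner(a, b):
    """Inner product of two states with real amplitudes."""
    return sum(amp * b.get(basis, 0.0) for basis, amp in a.items())


def exchange(state):
    """Interchange particles 1 and 2."""
    return {(b[1], b[0]) + b[2:]: amp for b, amp in state.items()}


r = 1 / math.sqrt(2)
triplet0 = {(UP, DOWN): r, (DOWN, UP): r}   # |1,0>
singlet = {(UP, DOWN): r, (DOWN, UP): -r}   # |0,0>

if __name__ == "__main__":
    # <S^2> should equal s(s+1): 2 for the triplet, 0 for the singlet.
    print("triplet <S^2>:", inner(triplet0, s_squared(triplet0)))
    print("singlet <S^2>:", inner(singlet, s_squared(singlet)))
    print("triplet symmetric:", exchange(triplet0) == triplet0)
    print("singlet antisymmetric:",
          exchange(singlet) == {b: -a for b, a in singlet.items()})
```

The same helpers work unchanged for three spins, since the basis labels are tuples of arbitrary length.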


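5.1.2 Combining three spin-½ particles 

We will now show that the multiplicity of states is given by 
$2 \otimes 2 \otimes 2 = 4 \oplus 2 \oplus 2$.5 We start by adding a 
spin-up particle to the $|1,1\rangle$ state to obtain the highest possible 
(maximally stretched) state $\left|\tfrac{3}{2},+\tfrac{3}{2}\right\rangle$, then 
apply angular-momentum-lowering operators to ‘step down’ from there: 

\[
\begin{aligned}
\left|\tfrac{3}{2},\tfrac{3}{2}\right\rangle &= |{\uparrow\uparrow\uparrow}\rangle, \\
\left|\tfrac{3}{2},\tfrac{1}{2}\right\rangle &= \tfrac{1}{\sqrt{3}}\left( |{\uparrow\uparrow\downarrow}\rangle + |{\uparrow\downarrow\uparrow}\rangle + |{\downarrow\uparrow\uparrow}\rangle \right), \\
\left|\tfrac{3}{2},-\tfrac{1}{2}\right\rangle &= \tfrac{1}{\sqrt{3}}\left( |{\downarrow\downarrow\uparrow}\rangle + |{\downarrow\uparrow\downarrow}\rangle + |{\uparrow\downarrow\downarrow}\rangle \right), \\
\left|\tfrac{3}{2},-\tfrac{3}{2}\right\rangle &= |{\downarrow\downarrow\downarrow}\rangle.
\end{aligned}
\]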

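Two more states come from adding the third particle to the triplet.
Requiring orthogonality to the $S = \tfrac32$, $s_z = \pm\tfrac12$ states, two $S = \tfrac12$ states
occur:

\[
\begin{aligned}
|\tfrac12, \tfrac12\rangle &= \tfrac{1}{\sqrt6}\left(2|\uparrow\uparrow\downarrow\rangle - |\uparrow\downarrow\uparrow\rangle - |\downarrow\uparrow\uparrow\rangle\right) \\
|\tfrac12, -\tfrac12\rangle &= \tfrac{1}{\sqrt6}\left(2|\downarrow\downarrow\uparrow\rangle - |\downarrow\uparrow\downarrow\rangle - |\uparrow\downarrow\downarrow\rangle\right)
\end{aligned}
\]

These states are of mixed symmetry.
Finally, two states are derived from adding the third particle to the
singlet:

\[
\begin{aligned}
|\tfrac12, \tfrac12\rangle &= \tfrac{1}{\sqrt2}\left(|\uparrow\downarrow\uparrow\rangle - |\downarrow\uparrow\uparrow\rangle\right) \\
|\tfrac12, -\tfrac12\rangle &= \tfrac{1}{\sqrt2}\left(|\uparrow\downarrow\downarrow\rangle - |\downarrow\uparrow\downarrow\rangle\right)
\end{aligned}
\]

which are antisymmetric combinations under the exchange of particles
$1 \leftrightarrow 2$. The pattern of combinations is indeed

\[ 2 \otimes (2 \otimes 2) = 4_{\mathrm{sym}} \oplus 2_{\mathrm{mix_S}} \oplus 2_{\mathrm{mix_A}} \]

where 'mix_S' ('mix_A') means a state that is a mixture of symmetric and
antisymmetric parts but becomes purely symmetric (antisymmetric) under
exchange of the first two particles.

⁴ See Chapter 2.

⁵ The multiplicities must be even for half-integer angular momentum.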
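The three-spin construction can be reproduced with a few lines of code. This is a sketch under simple assumptions: a state is a Python dictionary mapping a string such as `"++-"` (spin up, up, down) to its amplitude, and the helper names (`combine`, `lower`, `swap12`, `same`) are illustrative rather than taken from the text. The code steps down from the stretched state with the total lowering operator and checks the symmetry of the mixed states under exchange of particles 1 and 2.

```python
from math import sqrt, isclose


def combine(*terms):
    """Sum (coefficient, state) pairs into one state, dropping zero entries."""
    out = {}
    for coeff, state in terms:
        for key, amp in state.items():
            out[key] = out.get(key, 0.0) + coeff * amp
    return {k: v for k, v in out.items() if abs(v) > 1e-12}


def inner(a, b):
    """Inner product of two real-amplitude states."""
    return sum(amp * b.get(key, 0.0) for key, amp in a.items())


def normalise(state):
    return combine((1.0 / sqrt(inner(state, state)), state))


def lower(state):
    """Apply S_- = S_-(1) + S_-(2) + S_-(3): flip each '+' to '-' in turn."""
    out = {}
    for key, amp in state.items():
        for i, spin in enumerate(key):
            if spin == "+":
                new_key = key[:i] + "-" + key[i + 1:]
                out[new_key] = out.get(new_key, 0.0) + amp
    return out


def swap12(state):
    """Exchange particles 1 and 2."""
    return {key[1] + key[0] + key[2:]: amp for key, amp in state.items()}


def same(a, b):
    keys = set(a) | set(b)
    return all(isclose(a.get(k, 0.0), b.get(k, 0.0), abs_tol=1e-12) for k in keys)


# Quartet: start from the stretched state |+++> and step down three times.
quartet = [{"+++": 1.0}]
for _ in range(3):
    quartet.append(normalise(lower(quartet[-1])))

# The two s_z = +1/2 doublet states written out in the text.
mixed_s = combine((2 / sqrt(6), {"++-": 1.0}),
                  (-1 / sqrt(6), {"+-+": 1.0}),
                  (-1 / sqrt(6), {"-++": 1.0}))
mixed_a = combine((1 / sqrt(2), {"+-+": 1.0}),
                  (-1 / sqrt(2), {"-++": 1.0}))

print(quartet[1])
print(inner(mixed_s, quartet[1]), inner(mixed_a, quartet[1]), inner(mixed_s, mixed_a))
print(same(swap12(mixed_s), mixed_s), same(swap12(mixed_a), combine((-1.0, mixed_a))))
```

The three inner products vanish (up to rounding), confirming that the quartet and the two doublets are mutually orthogonal, which is the content of $2 \otimes 2 \otimes 2 = 4 \oplus 2 \oplus 2$ in the $s_z = +\tfrac12$ sector.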


5.2 The quark model of hadrons 


Quarks are fundamental, point-like constituents of matter that carry
either $+\tfrac23$ or $-\tfrac13$ electric charge (in units of the proton charge). They exist in $q\bar q$ or $qqq$ bound states, the
hadrons, held together by the strong force. The strong force is mediated
by gluons, which couple with equal strength to all quarks. The residual
strong force holds together the nucleus in analogy to the van der Waals
force between neutral atoms.

Strong-force bound states can be organized by a classification that
predates the invention of the quark model; it arose from similarities among
the hadrons observed in the 1960s, particularly from bubble chamber
experiments. The observed patterns of the quantum numbers (mass, spin,
parity, isospin, and strangeness) were crucial in the development of the
quark model.


Baryons 


• Baryons consist of three quarks or three antiquarks.

• They are fermions (with spins $J = \tfrac12, \tfrac32, \tfrac52, \ldots$).

• One baryon, the proton, is stable. The neutron is almost stable,
forming stable bound states with the proton in many atomic nuclei.
However, a free neutron is not stable, decaying to a proton, an
electron, and an antineutrino with a mean lifetime of 885.7 ± 0.8 s.

• Baryon number conservation was invented to explain the
non-observation of the lightest baryons decaying to purely mesonic or
leptonic final states. In the quark model, this translates into quark
number conservation.


Mesons 


• Mesons are quark–antiquark pairs.

• They are bosons, since they have integer spin ($J = 0, 1, 2, \ldots$).

• All mesons are unstable. The charged pion is among the longest lived, with $\tau(\pi^\pm) =$
26 ns ($c\tau \approx 8$ m).


Although the quark model is now taken for granted, it is useful to think 
about how physicists used the data to devise it and why the concepts 
matched so beautifully the experimental observations. 


5.2.1 Isospin 


Originally, physicists were intrigued by the similarity in mass and
properties of the proton and neutron. Inspired by spin-½ doublets,
it was postulated they were the same particle⁶ but with a different
projection of the third component of a new angular-momentum-like
quantum number: isospin. In fact, this is not the case, but it is still
a useful approximate symmetry for low-energy hadronic physics, where
perturbative QCD is not valid.

The key points are as follows: 


• $|u\rangle$ and $|d\rangle$ form an SU(2) doublet like spin ½:
\[ I_3|u\rangle = \tfrac12|u\rangle, \qquad I_3|d\rangle = -\tfrac12|d\rangle \]

• As with normal (angular momentum) spin, the raising and lowering
operators are
\[ I_+|d\rangle = |u\rangle, \qquad I_-|u\rangle = |d\rangle, \qquad I_-|d\rangle = 0 \]

• Although the formalism is the same, isospin has nothing to do
with angular momentum.

• Isospin is an approximately conserved quantity in strong
interactions (it would be exact if the masses of the u and d quarks were
identical).

• The validity of isospin is now understood to follow from two
fundamental assumptions:

(1) the approximate degeneracy in mass of the u and d quarks;

(2) u and d have an identical strong coupling to the gluon.


When dealing with antiquarks, one must be careful in applying charge
conjugation to the quark wavefunctions. We choose a convention that
allows Clebsch–Gordan coefficients to be applied in the same manner
as with quarks, although this introduces a somewhat confusing minus
sign:

\[ C|u\rangle = -|\bar u\rangle, \qquad C|d\rangle = |\bar d\rangle \]

where $C$ is the charge-conjugation operator. The raising and lowering
operators act on the antiquarks as follows:

\[ I_+|\bar u\rangle = -|\bar d\rangle, \qquad I_-|\bar d\rangle = -|\bar u\rangle, \qquad I_+|\bar d\rangle = 0, \qquad I_-|\bar u\rangle = 0 \]

Consider next the SU(2) isospin combinations of $2 \otimes \bar 2$,
where the Clebsch–Gordan coefficients follow the Condon–Shortley convention.⁷ These give
a symmetric singlet state

\[ |I = 0, I_3 = 0\rangle = \tfrac{1}{\sqrt2}\left(|d\bar d\rangle + |u\bar u\rangle\right) \]

and an antisymmetric triplet state

\[
\begin{aligned}
|I = 1, I_3 = 1\rangle &= |u\bar d\rangle = |\pi^+\rangle \\
|I = 1, I_3 = 0\rangle &= \tfrac{1}{\sqrt2}\left(|d\bar d\rangle - |u\bar u\rangle\right) = |\pi^0\rangle \\
|I = 1, I_3 = -1\rangle &= -|d\bar u\rangle = |\pi^-\rangle
\end{aligned}
\]

The triplet of pions is almost degenerate in mass: $m(\pi^\pm) = 140$ MeV,
$m(\pi^0) = 135$ MeV.⁸

⁶ The mass difference $m_n - m_p$ is of the
order of an electromagnetic correction.

⁷ We follow the convention defined originally
for atomic physics by Condon
and Shortley in their famous book The
Theory of Atomic Spectra [68].

⁸ A mass difference again of a magnitude
compatible with an electromagnetic effect.
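Isospin is also useful for baryons. As with spin, we expect a symmetric
quadruplet and two mixed-symmetry isospin doublets: $2 \otimes (2 \otimes 2) =
4 \oplus 2 \oplus 2$, which we can write out explicitly:

\[
\begin{aligned}
|I = \tfrac32, I_3 = \tfrac32\rangle &= |uuu\rangle \\
|I = \tfrac32, I_3 = \tfrac12\rangle &= \tfrac{1}{\sqrt3}\left(|duu\rangle + |udu\rangle + |uud\rangle\right) \\
|I = \tfrac32, I_3 = -\tfrac12\rangle &= \tfrac{1}{\sqrt3}\left(|ddu\rangle + |udd\rangle + |dud\rangle\right) \\
|I = \tfrac32, I_3 = -\tfrac32\rangle &= |ddd\rangle \\
|I = \tfrac12, I_3 = \tfrac12\rangle &= \tfrac{1}{\sqrt6}\left(2|uud\rangle - |duu\rangle - |udu\rangle\right) \\
|I = \tfrac12, I_3 = -\tfrac12\rangle &= \tfrac{1}{\sqrt6}\left(2|ddu\rangle - |udd\rangle - |dud\rangle\right) \\
|I = \tfrac12, I_3 = \tfrac12\rangle &= \tfrac{1}{\sqrt2}\left(|udu\rangle - |duu\rangle\right) \\
|I = \tfrac12, I_3 = -\tfrac12\rangle &= \tfrac{1}{\sqrt2}\left(|udd\rangle - |dud\rangle\right)
\end{aligned}
\]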


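An example of isospin analysis: Δ decays

• Baryon number conservation requires the Δ to decay to p or n.⁹

• Strong-interaction decays are favoured by a factor $O(10^4)$ over
electromagnetic decays.

• As the strong interaction dominates, we can use isospin to
understand relative rates using

\[ |\Delta^+\rangle = |\tfrac32, \tfrac12\rangle, \qquad |p\rangle = |\tfrac12, \tfrac12\rangle, \qquad |n\rangle = |\tfrac12, -\tfrac12\rangle \]

\[ |\pi^+\rangle = |1, 1\rangle, \qquad |\pi^0\rangle = |1, 0\rangle, \qquad |\pi^-\rangle = |1, -1\rangle \]

• Using Clebsch–Gordan coefficients, we expand the $I = \tfrac32$ state in
products of $I = \tfrac12$ and $I = 1$ states:

\[
|\tfrac32, \tfrac12\rangle = \sqrt{\tfrac13}\,|1, 1\rangle\,|\tfrac12, -\tfrac12\rangle + \sqrt{\tfrac23}\,|1, 0\rangle\,|\tfrac12, \tfrac12\rangle
= \sqrt{\tfrac13}\,|n\rangle|\pi^+\rangle + \sqrt{\tfrac23}\,|p\rangle|\pi^0\rangle
\]

• Deducing branching ratios, we have

\[
\frac{R(\Delta^+ \to \pi^0 p)}{R(\Delta^+ \to \pi^+ n)} = \frac{|\langle \pi^0 p\,|\,\Delta^+\rangle|^2}{|\langle \pi^+ n\,|\,\Delta^+\rangle|^2} = \frac{\left(\sqrt{2/3}\right)^2}{\left(\sqrt{1/3}\right)^2} = 2
\]

• Similarly, we can estimate the relative Δ cross sections for
formation in πp scattering at $\sqrt{s} \approx m(\Delta)$:

\[
\frac{\sigma(\pi^- p)}{\sigma(\pi^+ p)} = \frac{|\langle \Delta^0\,|\,\pi^- p\rangle|^2}{|\langle \Delta^{++}\,|\,\pi^+ p\rangle|^2} = \frac{1}{3}
\]

• If we assume that the cross section on resonance is dominated
by the Δ, the data are in reasonable agreement with the isospin
assignment: $\sigma(\pi^- p) \approx 70$ mb, $\sigma(\pi^+ p) \approx 200$ mb. See Fig. 5.1.

• Remember that isospin analysis is not exact; its usefulness arises
from the approximate degeneracy of the $|u\rangle$ and $|d\rangle$ masses.

⁹ Higher-mass baryon states are not accessible.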
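The Clebsch–Gordan factors used in the Δ example can be generated rather than looked up. The sketch below assumes only the standard ladder-operator action $I_-|I, I_3\rangle = \sqrt{I(I+1) - I_3(I_3-1)}\,|I, I_3 - 1\rangle$, which gives a factor $\sqrt2$ for each step of the pion triplet and 1 for $p \to n$. The state representation (a dictionary keyed by `(pion, nucleon)` labels) and all function names are illustrative assumptions.

```python
from math import sqrt, isclose

# Action of the isospin-lowering operator on each constituent:
# label -> (lowered label, coefficient).
PION_LOWER = {"pi+": ("pi0", sqrt(2)), "pi0": ("pi-", sqrt(2))}
NUCLEON_LOWER = {"p": ("n", 1.0)}


def lower(state):
    """Apply I_- = I_-(pion) + I_-(nucleon) to a pion-nucleon state."""
    out = {}
    for (pion, nucleon), amp in state.items():
        if pion in PION_LOWER:
            new_pion, coeff = PION_LOWER[pion]
            key = (new_pion, nucleon)
            out[key] = out.get(key, 0.0) + coeff * amp
        if nucleon in NUCLEON_LOWER:
            new_nucleon, coeff = NUCLEON_LOWER[nucleon]
            key = (pion, new_nucleon)
            out[key] = out.get(key, 0.0) + coeff * amp
    return out


def normalise(state):
    norm = sqrt(sum(amp * amp for amp in state.values()))
    return {key: amp / norm for key, amp in state.items()}


# Stretched state: Delta++ = |3/2, 3/2> = |pi+>|p>.
delta_pp = {("pi+", "p"): 1.0}
delta_p = normalise(lower(delta_pp))   # |3/2, +1/2>
delta_0 = normalise(lower(delta_p))    # |3/2, -1/2>

# Relative decay rates of the Delta+ and relative formation cross sections.
br_ratio = delta_p[("pi0", "p")] ** 2 / delta_p[("pi+", "n")] ** 2
xsec_ratio = delta_0[("pi-", "p")] ** 2 / delta_pp[("pi+", "p")] ** 2

print(br_ratio, xsec_ratio)
```

The two printed ratios reproduce the values 2 and 1/3 quoted in the isospin analysis (to floating-point precision).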


Although useful, isospin could not account for long-lived particles, 
originally labelled ‘V particles’, that were first observed in 1947 (in 
Manchester) with a mass ~500 times that of the electron (see Fig. 5.2). 
These new particles—known as ‘strange’ particles—were assigned a new 
quantum number and SU(2) had to be expanded. 


Fig. 5.1 Total cross sections $\sigma$ (mb) for $\pi^+ p$
(dotted line) and $\pi^- p$ (solid line) as
functions of the pion beam momentum
in GeV/c. From the PDG [114].

Fig. 5.2 One of the early 'V particles'
observed in 1947 in a cloud chamber
exposed to cosmic rays by Rochester and
Butler [123], then working in Blackett's
group at Manchester University.

¹⁰ Generically known now as a flavour
quantum number.

Fig. 5.3 Fundamental SU(3) flavour
representations for the $(u, d, s)$ and
$(\bar u, \bar d, \bar s)$ triplets, as functions of the
third component of isospin $I_3$
(horizontal) and strangeness $S$
(vertical).

5.2.2 Strangeness and expansion to SU(3) 


With isospin in hand to describe up-ness and down-ness, strangeness is
postulated to be a third quantum number that a hadron may possess.¹⁰
For historical reasons, convention dictates that $S = -1$ for $|s\rangle$ and $S = +1$
for $|\bar s\rangle$. Assuming that SU(3) is valid, we expect $3^2 - 1 = 8$ fundamental
operators.

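The fundamental representation of SU(3) comprises eight 3 × 3
matrices:

\[
\lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
\lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
\]
\[
\lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \qquad
\lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}
\]
\[
\lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \qquad
\lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}
\]
\[
\lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \qquad
\lambda_8 = \frac{1}{\sqrt3}\begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}
\]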

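SU(3) has three SU(2) groups embedded within it. In addition to
isospin (I-spin), there are U-spin and V-spin, which are the doublets
of $\begin{pmatrix} d \\ s \end{pmatrix}$ and $\begin{pmatrix} u \\ s \end{pmatrix}$ respectively. The fundamental representation for the
quark and antiquark triplets is shown in Fig. 5.3.

The raising and lowering operators are

\[
\begin{aligned}
I_+|d\rangle &= |u\rangle, & I_-|u\rangle &= |d\rangle, & I_+|\bar u\rangle &= -|\bar d\rangle, & I_-|\bar d\rangle &= -|\bar u\rangle \\
U_+|s\rangle &= |d\rangle, & U_-|d\rangle &= |s\rangle, & U_+|\bar d\rangle &= -|\bar s\rangle, & U_-|\bar s\rangle &= -|\bar d\rangle \\
V_+|s\rangle &= |u\rangle, & V_-|u\rangle &= |s\rangle, & V_+|\bar u\rangle &= -|\bar s\rangle, & V_-|\bar s\rangle &= -|\bar u\rangle
\end{aligned}
\]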


Any other combination is zero. The Condon-Shortley convention is used 
for the antiquarks, which gives rise to the ‘extra’ minus signs. Only two 
of the three are needed to describe and navigate through a multiplet, 
I-spin and U-spin being most commonly used. 
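The properties of the eight SU(3) matrices, and their link to the I-, U-, and V-spin ladder operators, can be checked directly. The sketch below is plain Python with hand-written 3 × 3 helpers. It assumes the standard identifications $I_\pm = \tfrac12(\lambda_1 \pm i\lambda_2)$, $V_\pm = \tfrac12(\lambda_4 \pm i\lambda_5)$, and $U_\pm = \tfrac12(\lambda_6 \pm i\lambda_7)$, which are not spelled out in the text, with quark basis vectors $u = (1,0,0)$, $d = (0,1,0)$, $s = (0,0,1)$.

```python
from math import sqrt

R3 = 1 / sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[R3, 0, 0], [0, R3, 0], [0, 0, -2 * R3]],
]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)]
            for r in range(3)]


def trace(a):
    return sum(a[i][i] for i in range(3))


def is_hermitian(a):
    return all(abs(a[r][c] - complex(a[c][r]).conjugate()) < 1e-12
               for r in range(3) for c in range(3))


def ladder(a, b, sign):
    """Return (lambda_a + sign * i * lambda_b) / 2 for 1-based labels a, b."""
    la, lb = GELL_MANN[a - 1], GELL_MANN[b - 1]
    return [[(la[r][c] + sign * 1j * lb[r][c]) / 2 for c in range(3)]
            for r in range(3)]


def apply(op, vec):
    return [sum(op[r][c] * vec[c] for c in range(3)) for r in range(3)]


U_QUARK, D_QUARK, S_QUARK = [1, 0, 0], [0, 1, 0], [0, 0, 1]

I_plus = ladder(1, 2, +1)   # raises d -> u
U_plus = ladder(6, 7, +1)   # raises s -> d
V_plus = ladder(4, 5, +1)   # raises s -> u

print(apply(I_plus, D_QUARK) == U_QUARK)
print(apply(U_plus, S_QUARK) == D_QUARK)
print(apply(V_plus, S_QUARK) == U_QUARK)
```

The normalization check $\mathrm{Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$ is a convenient way to catch a missing factor such as the $1/\sqrt3$ in $\lambda_8$.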


5.2.3 Mesons 


We now have the tools to extend beyond the $\pi^+$, $\pi^0$, $\pi^-$ states of SU(2)
$q\bar q$ combinations and identify all the SU(3) $q\bar q$ states. We start with the
simplest case of $J = 0$ pseudoscalar mesons. From SU(3), we expect

\[ 3 \otimes \bar 3 = 9 = 1_{\mathrm{sym}} \oplus 8_{\mathrm{mix}} \]

where 'sym' and 'mix' stand for symmetric and mixed symmetry,
respectively. The six states with non-zero $I_3$ or non-zero $S$ have
unambiguous quark content. The final three, with $I_3 = S = 0$, need a little
more thought:

• The symmetric singlet is 'obvious' by inspecting the perfect
symmetry of the flavour wavefunction. It is now known as the $\eta'$:

\[ |\eta'\rangle = \tfrac{1}{\sqrt3}\left(|u\bar u\rangle + |d\bar d\rangle + |s\bar s\rangle\right) \]


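The remaining two must be part of the octet of mixed symmetry. The
first step is to start with the 'outer' states of the octet and apply the
flavour-lowering operators to reach the centre:

\[
\begin{aligned}
I_-|u\bar d\rangle &= |d\bar d\rangle - |u\bar u\rangle \\
U_-|d\bar s\rangle &= |s\bar s\rangle - |d\bar d\rangle \\
V_-|u\bar s\rangle &= |s\bar s\rangle - |u\bar u\rangle
\end{aligned}
\]

However, only two of these three equations are independent. So we
proceed as follows:

• We choose one to be the well-established $\pi^0$, the isospin-triplet
partner of the $\pi^+$, $\pi^-$:

\[ |\pi^0\rangle = \tfrac{1}{\sqrt2}\left(|d\bar d\rangle - |u\bar u\rangle\right) \]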


Fig. 5.4 SU(3) pseudoscalar meson
nonet states as functions of $I_3$ and $S$.

¹¹ In fact, it is also slightly broken in
the pseudoscalar case.

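• This leaves the last member of the octet to be deduced by requiring
that its flavour wavefunction be orthogonal to the $|\pi^0\rangle$:

\[
\begin{aligned}
|\eta\rangle &= \alpha\left(|s\bar s\rangle - |u\bar u\rangle\right) + \beta\left(|s\bar s\rangle - |d\bar d\rangle\right) \\
&= \tfrac{1}{\sqrt6}\left(|u\bar u\rangle + |d\bar d\rangle - 2|s\bar s\rangle\right)
\end{aligned}
\]

where the constants are derived using $\langle \pi^0\,|\,\eta\rangle = 0$ and $\langle \eta\,|\,\eta\rangle = 1$.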
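The two conditions on $\alpha$ and $\beta$ can be made explicit with a short numerical check. In the sketch below, flavour states are dictionaries of amplitudes in the basis $(|u\bar u\rangle, |d\bar d\rangle, |s\bar s\rangle)$; the names `inner`, `eta_state`, `PI0`, and `ETA_PRIME` are illustrative. Orthogonality to the $\pi^0$ gives $(\alpha - \beta)/\sqrt2 = 0$, and normalization gives $6\alpha^2 = 1$; the overall sign is a convention, chosen here to match the form quoted above.

```python
from math import sqrt, isclose


def inner(a, b):
    """Inner product of two real flavour wavefunctions."""
    return sum(a[k] * b[k] for k in a)


# Amplitudes in the basis (u ubar, d dbar, s sbar).
PI0 = {"uu": -1 / sqrt(2), "dd": 1 / sqrt(2), "ss": 0.0}
ETA_PRIME = {"uu": 1 / sqrt(3), "dd": 1 / sqrt(3), "ss": 1 / sqrt(3)}


def eta_state(alpha, beta):
    """alpha (|s sbar> - |u ubar>) + beta (|s sbar> - |d dbar>)."""
    return {"uu": -alpha, "dd": -beta, "ss": alpha + beta}


# <pi0|eta> = (alpha - beta)/sqrt(2) = 0 forces alpha = beta;
# <eta|eta> = 6 alpha^2 = 1 then fixes |alpha| = 1/sqrt(6).
alpha = beta = -1 / sqrt(6)
ETA = eta_state(alpha, beta)

print(ETA)
print(inner(PI0, ETA), inner(ETA_PRIME, ETA), inner(ETA, ETA))
```

The first two inner products vanish and the last equals 1, so the $\pi^0$, $\eta$, and $\eta'$ flavour wavefunctions form an orthonormal set.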


The nonet of pseudoscalar meson states are plotted as functions of $I_3$
and $S$ in Fig. 5.4 and their properties are listed in Table 5.1.

The $J = 1$ vector mesons are also well-established states. As we might
expect, they exhibit the same pattern of states as the $J = 0$ mesons:
$3 \otimes \bar 3 = 8 \oplus 1$.

The nonet of vector meson states are plotted as functions of $I_3$ and $S$
in Fig. 5.5 and their properties are listed in Table 5.2.

The notable difference between the pseudoscalar and vector mesons is
that with the latter, for the $I_3 = 0$, $S = 0$ states, SU(3) is not exact and
'octet–singlet' mixing occurs.¹¹ Experimentally, one state ($\phi$) is observed
to decay largely to kaons, and the other ($\omega$) nearly always to pions. We
therefore assume that the states with $I_3 = 0$, $S = 0$ are maximally mixed,


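I     I_3    S     Meson          Composition                                                                         Decay                                   Mass (MeV)

1     1      0     $\pi^+$        $|u\bar d\rangle$                                                                   $\pi^+ \to \mu^+\nu_\mu$                140
1     −1     0     $\pi^-$        $|\bar u d\rangle$                                                                  $\pi^- \to \mu^-\bar\nu_\mu$            140
1     0      0     $\pi^0$        $\tfrac{1}{\sqrt2}(|d\bar d\rangle - |u\bar u\rangle)$                              $\pi^0 \to \gamma\gamma$                135
1/2   1/2    1     $K^+$          $|u\bar s\rangle$                                                                   $K^+ \to \mu^+\nu_\mu$                  494
1/2   −1/2   1     $K^0$          $|d\bar s\rangle$                                                                   $K^0 \to \pi\pi(\pi)$                   498
1/2   −1/2   −1    $K^-$          $|\bar u s\rangle$                                                                  $K^- \to \mu^-\bar\nu_\mu$              494
1/2   1/2    −1    $\bar K^0$     $|\bar d s\rangle$                                                                  $\bar K^0 \to \pi\pi(\pi)$              498
0     0      0     $\eta$         $\tfrac{1}{\sqrt6}(|u\bar u\rangle + |d\bar d\rangle - 2|s\bar s\rangle)$           $\eta \to \gamma\gamma$                 549
0     0      0     $\eta'$        $\tfrac{1}{\sqrt3}(|u\bar u\rangle + |d\bar d\rangle + |s\bar s\rangle)$            $\eta' \to \eta\pi\pi$ or $\rho\gamma$  958

Table 5.1 Properties of the SU(3) pseudoscalar meson nonet states.

I     I_3    S     Meson            Composition                                                   Decay                              Mass (MeV)

1     1      0     $\rho^+$         $|u\bar d\rangle$
1     −1     0     $\rho^-$         $|\bar u d\rangle$                                            $\rho \to \pi\pi$                  776
1     0      0     $\rho^0$         $\tfrac{1}{\sqrt2}(|d\bar d\rangle - |u\bar u\rangle)$
1/2   1/2    1     $K^{*+}$         $|u\bar s\rangle$
1/2   −1/2   1     $K^{*0}$         $|d\bar s\rangle$
1/2   −1/2   −1    $K^{*-}$         $|\bar u s\rangle$                                            $K^* \to K\pi$                     892
1/2   1/2    −1    $\bar K^{*0}$    $|\bar d s\rangle$
0     0      0     $\omega$         $\tfrac{1}{\sqrt2}(|u\bar u\rangle + |d\bar d\rangle)$        $\omega \to \pi^+\pi^-\pi^0$       783
0     0      0     $\phi$           $|s\bar s\rangle$                                             $\phi \to K\bar K$                 1019

Table 5.2 Properties of the SU(3) vector meson nonet states.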


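such that the quark composition is given by

\[
\begin{aligned}
|\rho^0\rangle &= \tfrac{1}{\sqrt2}\left(|d\bar d\rangle - |u\bar u\rangle\right), & \mathrm{BR}(\rho^0 \to \pi^+\pi^-) &\approx 100\% \\
|\omega\rangle &\approx \tfrac{1}{\sqrt2}\left(|d\bar d\rangle + |u\bar u\rangle\right), & \mathrm{BR}(\omega \to \pi^+\pi^-\pi^0) &\approx 90\% \\
|\phi\rangle &\approx |s\bar s\rangle, & \mathrm{BR}(\phi \to K\bar K) &= 84\% \\
& & \mathrm{BR}(\phi \to \pi^+\pi^-\pi^0) &= 15\%
\end{aligned}
\]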


5.2.4 Baryons 


The baryon wavefunction is made of four parts,

\[ \psi = \psi_{\mathrm{space}}\,\psi_{\mathrm{spin}}\,\psi_{\mathrm{flavour}}\,\psi_{\mathrm{colour}} \]

and we consider only ground-state baryons (p, n, Λ, etc.) that have no
orbital angular momentum ($L = 0$), so that $\psi_{\mathrm{space}}$ is symmetric. The colour
wavefunction $\psi_{\mathrm{colour}}$ is always antisymmetric:

\[ \psi_{\mathrm{colour}} = \tfrac{1}{\sqrt6}\left(|RGB\rangle + |GBR\rangle + |BRG\rangle - |GRB\rangle - |BGR\rangle - |RBG\rangle\right) \]

With $\psi_{\mathrm{space}}\psi_{\mathrm{colour}}$ being antisymmetric and the overall fermionic
wavefunction required to be antisymmetric, $\psi_{\mathrm{spin}}\psi_{\mathrm{flavour}}$ must be symmetric.
$\psi_{\mathrm{spin}}$ and $\psi_{\mathrm{flavour}}$ may be a mixture of symmetric and antisymmetric
states, but the total spin-flavour wavefunction must be symmetric.

Fig. 5.5 SU(3) vector meson nonet
states as functions of $I_3$ and $S$.


• We deal with $\psi_{\mathrm{spin}}$ first. The possible combinations are

\[ 2 \otimes (2 \otimes 2) = 8 = 4_{\mathrm{sym}} \oplus 2_{\mathrm{mix_S}} \oplus 2_{\mathrm{mix_A}} \]

where the four symmetric states ('sym') have spin $\tfrac32$ and the other
two ('mix_S' and 'mix_A'), with mixed symmetry, have spin $\tfrac12$.

• The SU(3) decomposition of three flavours is found by first
combining two quark states, then adding the third:

\[ 3 \otimes 3 = 6_{\mathrm{sym}} \oplus \bar 3_{\mathrm{asym}} \]
\[ 3 \otimes 3 \otimes 3 = 10_{\mathrm{sym}} \oplus 8_{\mathrm{mix_S}} \oplus 8_{\mathrm{mix_A}} \oplus 1_{\mathrm{asym}} \]

where 'sym', 'mix_S', and 'mix_A' are as defined above and 'asym'
is the totally antisymmetric three-quark combination.

• Finally, the possible symmetric (SU(3), SU(2)) combinations are
the symmetric $(10, 4)$ and the mixed symmetric–antisymmetric

\[ \tfrac{1}{\sqrt2}\left[(8, 2)_{\mathrm{mix_S}} + (8, 2)_{\mathrm{mix_A}}\right] \]

The ground-state octet (spin $\tfrac12$) and decuplet (spin $\tfrac32$) are shown in
Fig. 5.6.

Fig. 5.6 SU(3) ground state (spin-½) baryon octet states and the (spin-3/2) decuplet first excited states.

Fig. 5.7 Photograph and line diagram
of an $\Omega^-$ event from [44].

There are two final points:

• The $\Delta^{++}$ is manifestly symmetric in $\psi_{\mathrm{flavour}}\psi_{\mathrm{spin}}\psi_{\mathrm{space}}$. This
observation was the driving force for an additional degree of
freedom: the colour charge.

• The quark model was used to predict the $\Omega^-$, a strangeness-(−3)
baryon.¹²

The $\Omega^-$ was discovered using $K^- p$ interactions. A bubble chamber
photograph [44] of the first observation of an $\Omega^-$ is shown in Fig. 5.7. The
new particle was observed to decay via three weak decays. The decays
are clearly weak because the intermediate particles travel an appreciable
distance before decaying in turn. This implies a longer lifetime than
decays via the electromagnetic or strong interactions. The production
and decay chain is

\[ K^- p \to \Omega^- K^+ K^0 \;\ldots\; \Omega^- \to \Xi^0 \pi^- \;\ldots\; \Xi^0 \to \Lambda^0 \pi^0 \;\ldots\; \Lambda^0 \to p\,\pi^- \]

where each decay reduces the magnitude of the strangeness by one unit.¹³

¹² To be historically accurate, (broken)
SU(3) flavour symmetry was used. This
included quark-like objects, but did not […]
key papers on QCD and confinement
were published.

¹³ $\Omega^-$ has strangeness −3 and spin $\tfrac32$.


5.2.5 Deriving the complete spin—flavour 
wavefunction 


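A final step is to obtain the explicit quark model wavefunction for the
spin-up proton. We concentrate on the non-trivial $\psi_{\mathrm{spin}}\psi_{\mathrm{flavour}}$ parts,
which must be symmetric. From an inspection of the three spin-½
combinations, for angular-momentum spin (Section 5.1.2) and isospin (end
of Section 5.2.1), we note that the $|S = \tfrac12, s_z = \tfrac12\rangle$ and $|I = \tfrac12, I_3 = \tfrac12\rangle$
parts are of mixed symmetry. So we must take care to combine these
wavefunctions appropriately. Using $M_S$ and $M_A$ as generic labels for
symmetric and antisymmetric combinations, respectively, we get

\[ |p\!\uparrow\rangle = \tfrac{1}{\sqrt2}\left[\psi_{\mathrm{spin}}(M_S)\,\psi_{\mathrm{flavour}}(M_S) + \psi_{\mathrm{spin}}(M_A)\,\psi_{\mathrm{flavour}}(M_A)\right] \]

where

\[
\begin{aligned}
\psi_{\mathrm{spin}}(M_S) &= |\tfrac12, \tfrac12\rangle = \tfrac{1}{\sqrt6}\left(2|\uparrow\uparrow\downarrow\rangle - |\uparrow\downarrow\uparrow\rangle - |\downarrow\uparrow\uparrow\rangle\right) \\
\psi_{\mathrm{spin}}(M_A) &= |\tfrac12, \tfrac12\rangle = \tfrac{1}{\sqrt2}\left(|\uparrow\downarrow\uparrow\rangle - |\downarrow\uparrow\uparrow\rangle\right)
\end{aligned}
\]

and, with different notation, but identical in content:

\[
\begin{aligned}
\psi_{\mathrm{flavour}}(M_S) &= \tfrac{1}{\sqrt6}\left(2|uud\rangle - |duu\rangle - |udu\rangle\right) \\
\psi_{\mathrm{flavour}}(M_A) &= \tfrac{1}{\sqrt2}\left(|udu\rangle - |duu\rangle\right)
\end{aligned}
\]

Putting this all together, we can write out the complete proton
wavefunction:

\[
\begin{aligned}
|p\!\uparrow\rangle = \tfrac{1}{\sqrt{18}}\big(&2|u\!\uparrow u\!\uparrow d\!\downarrow\rangle - |u\!\uparrow u\!\downarrow d\!\uparrow\rangle - |u\!\downarrow u\!\uparrow d\!\uparrow\rangle \\
+\;&2|u\!\uparrow d\!\downarrow u\!\uparrow\rangle - |u\!\uparrow d\!\uparrow u\!\downarrow\rangle - |u\!\downarrow d\!\uparrow u\!\uparrow\rangle \\
+\;&2|d\!\downarrow u\!\uparrow u\!\uparrow\rangle - |d\!\uparrow u\!\uparrow u\!\downarrow\rangle - |d\!\uparrow u\!\downarrow u\!\uparrow\rangle\big) \times |\psi_{\mathrm{colour}}\rangle
\end{aligned}
\]

¹⁴ The zoo continues to grow as more
precise experiments coupled with more
sophisticated analysis techniques
uncover the states with non-zero orbital
angular momentum among the quark
constituents.

¹⁵ The tau lepton with a presumed tau
neutrino were discovered in 1975 by the
SLAC/Berkeley collaboration using the
$e^+e^-$ collider (SPEAR) through a larger
rate for $e^+e^- \to e^\pm\mu^\mp + X^0$ (where
$X^0$ is missing energy) than could be
explained by higher-order purely
electromagnetic processes.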
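The spin-flavour part of the proton wavefunction of Section 5.2.5 can be assembled mechanically from the mixed-symmetry pieces, which is a useful check on the nine coefficients. The sketch below is illustrative only: states are dictionaries, a three-quark basis state is a tuple such as `("u+", "u+", "d-")` with `+`/`-` for spin up/down, and the helper names are assumptions of this example. It confirms that the combination is normalized and symmetric under exchange of any pair of quarks (exchanges 1↔2 and 2↔3 generate all permutations).

```python
from math import sqrt, isclose
from itertools import product


def tensor(flavour, spin):
    """Combine a flavour state and a spin state quark by quark."""
    out = {}
    for (fkey, famp), (skey, samp) in product(flavour.items(), spin.items()):
        key = tuple(f + s for f, s in zip(fkey, skey))
        out[key] = out.get(key, 0.0) + famp * samp
    return out


def add(a, b):
    out = dict(a)
    for key, amp in b.items():
        out[key] = out.get(key, 0.0) + amp
    return {k: v for k, v in out.items() if abs(v) > 1e-12}


def scale(c, a):
    return {k: c * v for k, v in a.items()}


def swap(state, i, j):
    """Exchange quarks i and j (0-based positions)."""
    out = {}
    for key, amp in state.items():
        labels = list(key)
        labels[i], labels[j] = labels[j], labels[i]
        out[tuple(labels)] = amp
    return out


def same(a, b):
    keys = set(a) | set(b)
    return all(isclose(a.get(k, 0.0), b.get(k, 0.0), abs_tol=1e-12) for k in keys)


r2, r6 = sqrt(2), sqrt(6)
spin_ms = {"++-": 2 / r6, "+-+": -1 / r6, "-++": -1 / r6}
spin_ma = {"+-+": 1 / r2, "-++": -1 / r2}
flav_ms = {"uud": 2 / r6, "udu": -1 / r6, "duu": -1 / r6}
flav_ma = {"udu": 1 / r2, "duu": -1 / r2}

proton = scale(1 / r2, add(tensor(flav_ms, spin_ms), tensor(flav_ma, spin_ma)))

# Print each term with its coefficient in units of 1/sqrt(18).
for key, amp in sorted(proton.items()):
    print(" ".join(key), round(amp * sqrt(18)))
```

Each printed coefficient is either 2 or −1, in agreement with the nine-term expression with overall factor $1/\sqrt{18}$.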


5.3 Heavy quarks 


By the mid-1960s, broken SU(3) flavour symmetry was becoming
established as a plausible explanation for the patterns appearing among the
zoo of hadronic states being discovered.¹⁴ The underlying quark model
was much more contentious: it was not clear if a quark, which had
never been observed in isolation, was a real particle or just a convenient
mathematical construct. The experimental and theoretical evidence for
quarks being real particles is discussed in Chapter 9. Flavour SU(3), as
discussed in this chapter, was used to make successful predictions for
hadrons composed of the light u, d, and s quarks. With the discovery
of charm and the c quark and some years later of the b quark, a whole
new world of charm and beauty hadrons opened up. Although most of
the experimental work was carried out at $e^+e^-$ colliders, the first
evidence for b quarks came from a fixed-target proton beam on a nuclear
target. At the time, it was confidently expected, on the basis that there
should be six types of quark to match the six leptons,¹⁵ that the top
(t) quark would be found at the higher-energy $e^+e^-$ colliders then being
constructed in Europe and Japan, but it was not to be. The first evidence
for the t quark came from the Tevatron using $p\bar p$ collisions at 1.8 TeV
centre-of-mass energy. This section covers the discovery and properties
of the b and c quarks; the discovery of the t quark and its properties are
covered in detail in Section 8.6.


5.3.1 The charm quark 


The need for a fourth quark was already being discussed (see Chapter 7)
before the $J/\psi$ and the D mesons were discovered.¹⁶ Indirect evidence
for charm arrived in 1974 from two experiments:

• Richter et al. [40], using $e^+e^-$ collisions at SPEAR (the Stanford
Positron Electron Asymmetric Ring) with CMS energy $\sqrt{s} \approx$
3 GeV, measured the three processes $e^+e^- \to$ hadrons, $e^+e^- \to
\mu^+\mu^-$, and $e^+e^- \to e^+e^-$ (see Fig. 5.8(a)).

• Ting et al. [39] used 28 GeV protons on a beryllium target at
the Brookhaven National Laboratory to study the invariant mass
spectrum of $e^+e^-$ pairs produced by $p + \mathrm{Be} \to e^+e^- + X$ (see
Fig. 5.8(b)).

Ting et al. named the new state J; Richter et al. chose $\psi$ (a rare example
of a particle looking like its name: Fig. 5.8(c) is an event display from
the Stanford group [9] showing $\psi' \to \psi\pi^+\pi^-$ followed immediately by
$\psi \to \mu^+\mu^-$). What was striking in both cases was how exceptionally
narrow the Breit–Wigner resonance needed to fit the data in the vicinity of
3 GeV was. The $J/\psi$ mass and width are 3096 MeV and 93 keV; compare this
with a typical hadronic resonance, the $\rho(1700)$, with mass ~1720 MeV
and width ~250 MeV. The $e^+e^-$ and $\mu^+\mu^-$ decay modes of the $J/\psi$
were equal and together amounted to ~12%, with hadronic modes
accounting for the other 88% of decays. There was no doubt that the
$J/\psi$ was a hadronic state, but its decays were highly suppressed, by
a factor of roughly 2500. Many models were suggested, from new
quarks to supersymmetry, but the simplest turned out to be the former and
it was named the charm quark, with charge $+\tfrac23$ and carrying the new
charm quantum number in analogy to strangeness.¹⁷ Data taken at $e^+e^-$
colliders at energies above the $J/\psi$ (and $\psi'(3686)$) showed evidence for
a threshold being passed, with new meson states being pair-produced,
consistent with $c\bar q + \bar c q$ pairs.

Fig. 5.8 Plots and pictures from the $J/\psi$ discovery papers (a striking object indeed!): (a) from [40], (b) from [39], and (c) from [9]. Panel (a) shows the cross sections for $e^+e^- \to$ hadrons, $e^+e^- \to \mu^+\mu^-$ ($|\cos\theta| \le 0.6$), and $e^+e^- \to e^+e^-$ ($|\cos\theta| \le 0.6$) as functions of $E_{\mathrm{c.m.}}$ (GeV); panel (b) shows the invariant mass spectrum $m_{e^+e^-}$ (GeV) of $e^+e^-$ pairs (242 events).

¹⁶ The $J/\psi$ is a $c\bar c$ state and thus has
net charm zero; D mesons are $c\bar q$ or $q\bar c$
with charm quantum number ±1.

¹⁷ The correct PDG names for the $J/\psi$
and $\psi'$ are $J/\psi(1S)$ and $\psi(2S)$, respectively.
The open-charm mesons are the
$D$ and $\bar D$ states.

¹⁸ See Exercise 5.5 at the end of the
chapter.


5.3.2 The beauty quark 


A Fermilab experiment [86] using the 400 GeV proton beam on a nuclear
target (A = Cu or Pt) measured the dimuon invariant mass spectrum in
$p + A \to \mu^+\mu^- + X$. It was expected and observed that the dimuon mass
spectrum would fall exponentially. The rapidity–invariant mass double
differential cross section was fitted at $y = 0$ by an expression of the form

\[ \left.\frac{d^2\sigma}{dm\,dy}\right|_{y=0} = A\,e^{-bm} \]

where $m$ is the invariant mass. A fit to the mass range 6 GeV < $m$ <
12 GeV, excluding the range 8.8–10.6 GeV, gave $b = 0.98 \pm 0.02$ GeV⁻¹.
A statistically significant enhancement was observed at ~9.5 GeV. The
experiment did not have sufficient mass resolution to resolve the excess
above the steeply falling dimuon mass, but it could be fitted with one
or two resonances.

The beauty quark is also known as the bottom quark, since it forms
a doublet with the top quark.


5.3.3 The top quark 


The top quark was discovered at the Tevatron in $p\bar p \to t\bar t + X$ at $\sqrt{s} =$
1.8 TeV. Its discovery and properties are covered in detail in Section 8.6.


5.3.4 Charm and beauty states 


Both the c and b quarks can form meson and baryon states with the light
quarks. The t quark decays too rapidly to form such bound states.¹⁸
This section gives a brief overview of heavy flavour states. The charm
and beauty states, particularly the B mesons, have been studied in great
detail at dedicated $e^+e^-$ colliders, providing a wealth of experimental
results, only a fraction of which can be covered here. Oscillation
phenomena and mixing of $D^0$–$\bar D^0$ and $B^0$–$\bar B^0$ states are discussed in some
detail in Chapter 10.


Meson states 


The heavy meson states are $Q\bar q$ systems; the lightest with zero orbital
angular momentum have $J^P = 0^-, 1^-$. Table 5.3 shows those for $c\bar q$ and
the antiparticles $\bar c q$.
For the D mesons,

\[ m_{D^\pm} - m_{D^0} = 4.77 \pm 0.10\ \mathrm{MeV} \]

and

\[ |m_{D_1^0} - m_{D_2^0}| = 2.39^{+0.59}_{-0.63} \times 10^{10}\ \hbar\,\mathrm{s}^{-1} \]

Similar details are given for the $b\bar q$ and $\bar b q$ mesons in Table 5.4. For
the B mesons,

\[ m_{B^0} - m_{B^\pm} = 0.33 \pm 0.06\ \mathrm{MeV} \]

and

\[ |m_{B_H^0} - m_{B_L^0}| = (0.507 \pm 0.005) \times 10^{12}\ \hbar\,\mathrm{s}^{-1} \]

or, equivalently, $(3.337 \pm 0.033) \times 10^{-10}$ MeV.

State              Quarks       J^P    I      Mass (MeV)          Lifetime (×10⁻¹⁵ s) or Width

$D^+$              $c\bar d$    0⁻     1/2    1869.60 ± 0.16      1040 ± 7
$D^-$              $\bar c d$   0⁻     1/2    1869.60 ± 0.16      1040 ± 7
$D^0$              $c\bar u$    0⁻     1/2    1864.83 ± 0.14      410 ± 1.5
$\bar D^0$         $\bar c u$   0⁻     1/2    1864.83 ± 0.14      410 ± 1.5
$D^*(2007)^0$      $c\bar u$    1⁻     1/2    2006.96 ± 0.16      Γ < 2.1 MeV
$D^*(2010)^+$      $c\bar d$    1⁻     1/2    2010.25 ± 0.14      Γ = 96 ± 22 keV

Table 5.3 Lowest-lying charmed meson states; the c quark has charm C = +1.

State                 Quarks                    J^P    I      Mass (MeV)          Lifetime (×10⁻¹² s)

$B^\pm$               $u\bar b$, $\bar u b$     0⁻     1/2    5279.17 ± 0.29      1.638 ± 0.011
$B^0$, $\bar B^0$     $d\bar b$, $\bar d b$     0⁻     1/2    5279.50 ± 0.30      1.525 ± 0.009

Table 5.4 Lowest-lying beauty meson states; the b quark has beauty B = +1.


As for the neutral K mesons, the question arises of CP violation and
flavour oscillations. These in turn depend on the small mass difference
between the states equivalent to the $K^0_S$, $K^0_L$ states. CP violation and
charm or beauty oscillations for neutral meson states depend very
sensitively on the mass differences; these questions are covered in
Chapter 10.
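The two ways of quoting the neutral B mass difference, in units of $\hbar\,\mathrm{s}^{-1}$ and in MeV, are related by a single multiplication by $\hbar$. A minimal sketch of the conversion follows; the value of $\hbar$ in MeV s is the CODATA figure and the function name is an illustrative choice.

```python
HBAR_MEV_S = 6.582119569e-22  # reduced Planck constant in MeV s (CODATA)


def angular_frequency_to_mev(delta_m_hbar_per_s):
    """Convert a mass difference quoted in units of hbar/s into MeV."""
    return delta_m_hbar_per_s * HBAR_MEV_S


delta_m_b = angular_frequency_to_mev(0.507e12)
delta_m_b_err = angular_frequency_to_mev(0.005e12)
print(f"{delta_m_b:.3e} +/- {delta_m_b_err:.1e} MeV")
```

The result agrees with the quoted $(3.337 \pm 0.033) \times 10^{-10}$ MeV, some thirteen orders of magnitude below the B meson mass itself, which is why oscillation measurements are so sensitive.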


5.3.5 Heavy QQ systems 


Positronium, the electromagnetic bound system of an electron and
a positron, has provided a very 'clean' laboratory for understanding
quantum electrodynamics (QED) without any nuclear complications.
Similarly, the $Q\bar Q$ systems provide a laboratory for understanding
QCD. The mass scale provided by the heavy quarks means that
perturbative methods can be used for the QCD calculations. The
success of these calculations in describing the details of the $Q\bar Q$
systems was important for the development of QCD itself. Of the
three heavy quarks, only $c\bar c$ and $b\bar b$ systems exist. The $t\bar t$ bound
system does not have time to form before the top quarks have
decayed.


5.3.6 Charmonium 


The first plots with evidence for the $J/\psi$ are shown in Fig. 5.8. Once the
excitement of the discovery in quick succession of the $J/\psi$ and $\psi'$ died
down, the focus turned to establishing the properties of these states.


Isospin


The first question to be asked is how much u-ness and d-ness there are
in this new state. The $J/\psi$ isospin assignment comes from observing

\[ \mathrm{BR}(J/\psi \to \rho^+\pi^-) = \mathrm{BR}(J/\psi \to \rho^0\pi^0) = \mathrm{BR}(J/\psi \to \rho^-\pi^+) \]

Consulting the Clebsch–Gordan tables gives the $J/\psi$ isospin:
$|I = 0, I_3 = 0\rangle$ (i.e. zero isospin!):

                                                                J:   2               1               0
                                                                M:   0               0               0
State ($m_1$, $m_2$)
$|\rho^+\pi^-\rangle = |1, 1\rangle|1, -1\rangle$                    $1/\sqrt6$      $1/\sqrt2$      $1/\sqrt3$
$|\rho^0\pi^0\rangle = |1, 0\rangle|1, 0\rangle$                     $\sqrt{2/3}$    $0$             $-1/\sqrt3$
$|\rho^-\pi^+\rangle = |1, -1\rangle|1, 1\rangle$                    $1/\sqrt6$      $-1/\sqrt2$     $1/\sqrt3$
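The argument from the Clebsch–Gordan table can be stated in a few lines of code: for each candidate total isospin, the relative rates into $\rho^+\pi^-$, $\rho^0\pi^0$, and $\rho^-\pi^+$ are the squares of the coefficients in the corresponding column, and only one column gives three equal rates. The dictionary layout and function names below are illustrative assumptions; the coefficients are the standard $1 \otimes 1$ values with $M = 0$.

```python
from math import sqrt, isclose

# Clebsch-Gordan coefficients <1 m1; 1 m2 | I 0>, with rows ordered as
# (rho+ pi-), (rho0 pi0), (rho- pi+).
CG = {
    2: [1 / sqrt(6), sqrt(2 / 3), 1 / sqrt(6)],
    1: [1 / sqrt(2), 0.0, -1 / sqrt(2)],
    0: [1 / sqrt(3), -1 / sqrt(3), 1 / sqrt(3)],
}


def relative_rates(isospin):
    """Relative decay rates into the three rho-pi charge states."""
    return [c * c for c in CG[isospin]]


def equal_rates(isospin):
    rates = relative_rates(isospin)
    return all(isclose(r, rates[0], abs_tol=1e-12) for r in rates)


for isospin in (0, 1, 2):
    print(isospin, [round(r, 3) for r in relative_rates(isospin)], equal_rates(isospin))
```

Only $I = 0$ yields equal branching ratios (1/3 each); $I = 1$ forbids $\rho^0\pi^0$ altogether and $I = 2$ favours it by a factor of four.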


$J^{PC}$


Using SPEAR, the SLAC/LBL team were examining $e^+e^- \to \mu^+\mu^-$.
There are two possible processes:

(1) resonant via a $J/\psi$ decaying electromagnetically;

(2) non-resonant via a virtual photon.

The lower two plots in Fig. 5.8(a) show $\sigma(e^+e^- \to \mu^+\mu^-)$ and $\sigma(e^+e^- \to
e^+e^-)$ at CMS energies around 3.095 GeV. The middle plot shows a
clear dip on the low-mass side of the resonance. This is evidence of the
interference between the resonant (via the $J/\psi$) and non-resonant (via
a photon) channels, which can only happen if the $J/\psi$ has the same
quantum numbers as the photon: $J^{PC} = 1^{--}$.


Width 


Consider the Breit–Wigner formula for $e^+e^- \to J/\psi \to e^+e^-$:

\[ \sigma(E) = 4\pi\bar\lambda^2\,\frac{2J + 1}{(2s_1 + 1)(2s_2 + 1)}\,\frac{\Gamma_{ee}^2/4}{(E - E_0)^2 + \Gamma^2/4} \qquad (5.1) \]

For this reaction, $J = 1$, $s_1 = s_2 = \tfrac12$, so

\[ \sigma(E)_{J/\psi \to e^+e^-} = 3\pi\bar\lambda^2\,\frac{\Gamma_{ee}^2/4}{(E - E_0)^2 + \Gamma^2/4} \qquad (5.2) \]

This can be integrated to find the total cross section, giving

\[ \int_{-\infty}^{\infty} \sigma(E)\,dE = \frac{3\pi^2}{2}\,\bar\lambda^2 \left(\frac{\Gamma_{ee}}{\Gamma}\right)^2 \Gamma \qquad (5.3) \]

From the measurements made by Richter et al. (shown in Fig. 5.8(a)),
we can deduce the following:

• The integrated cross section for $e^+e^- \to J/\psi \to e^+e^-$ is ≈ 800 nb
MeV.

• $\Gamma_{ee}/\Gamma = \mathrm{BR}(J/\psi \to e^+e^-) \approx 6\%$.

• Taken together, this gives a value of $\Gamma \approx 93$ keV.

This is much narrower than the full widths of the established $J^{PC} = 1^{--}$
mesons: $K^*(1410)$ with width 232 MeV, $\rho(770)$ with width 149 MeV,
and even $\phi(1020)$ with the relatively narrow width of 4.3 MeV. This
shows that the $J/\psi$ cannot be composed of combinations of the light
(u, d, s) quarks. The very narrow width can be understood if the state is
made of heavier quarks (charm).
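The width estimate can be reproduced numerically by solving eq. (5.3) for $\Gamma$. The sketch below makes some simplifying assumptions that are not stated in the text: the electron mass is neglected, so the beam momentum on resonance is $m(J/\psi)/2$ and $\bar\lambda = \hbar c / p$; the conversion constants $\hbar c = 197.327$ MeV fm and 1 fm² = 10 mb are standard values; and the function name is illustrative.

```python
from math import pi

HBARC_MEV_FM = 197.327   # hbar * c in MeV fm
NB_PER_FM2 = 1.0e7       # 1 fm^2 = 10 mb = 1e7 nb


def total_width_mev(integrated_xsec_nb_mev, branching_ratio, mass_mev):
    """Solve eq. (5.3) for the total width, neglecting the electron mass."""
    beam_momentum = mass_mev / 2.0
    lambda_bar_sq_nb = (HBARC_MEV_FM / beam_momentum) ** 2 * NB_PER_FM2
    # Integrated cross section per unit total width, in nb.
    per_unit_width = 1.5 * pi ** 2 * lambda_bar_sq_nb * branching_ratio ** 2
    return integrated_xsec_nb_mev / per_unit_width


gamma_kev = 1000.0 * total_width_mev(800.0, 0.06, 3096.0)
print(f"Gamma = {gamma_kev:.0f} keV")
```

With the rounded inputs (800 nb MeV and 6%), the result comes out a little over 90 keV, consistent with the quoted $\Gamma \approx 93$ keV given the precision of the inputs.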


Charmonium states 


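Soon after the discovery of the $J/\psi$, more $J^{PC} = 1^{--}$ $c\bar c$ states were
found:

\[
\begin{aligned}
&\psi(2S), & \Gamma &= 320\ \mathrm{keV}, & \mathrm{BR}(\psi(3686) \to J/\psi\,\pi\pi) &\approx 50\% \\
&\psi(3S), & \Gamma &= 27.3\ \mathrm{MeV}, & \mathrm{BR}(\psi(3770) \to D\bar D) &\approx 85\%
\end{aligned}
\]

Further $c\bar c$ states with other values of $J^{PC}$ were later discovered: the
charmonium spectrum is shown in Fig. 5.9.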


Charmonium decays are governed by kinematics:

• $m(J/\psi) < m(\psi') < 2m(D) < m(\psi'')$.

• An odd number of gluons is required. Two or three gluons can be
in a colour singlet state, so that a decay to two or three gluons
would be compatible with colour conservation. For the analogous
decays to photons, we can see that C-parity requires three photons.
Gluons are coloured and are therefore not eigenstates of C-parity;
nevertheless, it turns out that the two-gluon decay mode is also forbidden,
as in the photon case. A single gluon is not possible, since the final
state must be colourless.

• Therefore, the minimum number of gluons for a strong decay is
three (Fig. 5.10(a)).

• This means that the hadronic decay rate depends on $\alpha_s^3$ (at the
$J/\psi$ mass scale), where $\alpha_s$ is proportional to the square of the strong
coupling constant, in analogy to the fine-structure constant $\alpha_{\mathrm{QED}}$.

The rapid decay of $\psi(3770) \to D\bar D$ is via a single gluon exchange, $\propto \alpha_s^2$,
since it is above threshold for strong decays to a pair of charmed mesons
(Fig. 5.10(b)).

To see that we need three-gluon decay modes, first consider the decay of
the $J/\psi$. This is below threshold to decay into charm mesons. There is
another hadronic decay, into pions. Because there are no charm quarks
in the final state, the charm/anticharm pair in the initial state must
annihilate for this decay, and there must be gluon propagators connecting
the initial and the final state. As this is again a strong decay, the
conservation laws for the strong interaction must be obeyed at all stages of the
process. The relevant conserved property is the charge conjugation
eigenvalue. For the initial state with $L = 0$ and $S = 1$, $C = (-1)^{L+S} = -1$.
For the final state, a decay into a single pion would not satisfy momentum
conservation. A decay into $\pi\pi$ ($C = +1$) would violate charge
conjugation, so the final state must have at least three pions. More important
is the structure of the intermediate gluon state. It cannot be a single
gluon, because the initial state is a colour singlet (it's a particle), and
a single gluon can never be a colour singlet. The gluon is not an
eigenstate of charge conjugation, because of its colour content (e.g. a gluon
with $r\bar b$ becomes $-b\bar r$; the minus sign occurs for similar reasons as the minus sign
in the charge conjugation eigenvalue for the photon), but a multi-gluon
state can be a charge conjugation eigenstate. A two-gluon colour singlet
state will contain contributions like $(r\bar b)(b\bar r)$. The charge conjugate state
will then have $(-b\bar r)(-r\bar b)$, so the charge conjugation eigenvalue of the
two-gluon state will be $C = +1$, and there can be no strong decay of
a $J/\psi$ into two gluons. A three-gluon state would contain elements like
$(r\bar b)(b\bar g)(g\bar r) \pm (b\bar r)(g\bar b)(r\bar g)$. One of these combinations will have $C = -1$,
and so a decay into three gluons is possible.

Fig. 5.10 (a) $J/\psi$ decay to a three-pion
hadronic final state via three gluons.
(b) $\psi(3S)$ decay to $D\bar D$.

¹⁹ At an $e^+e^-$ collider, the energy of
a resonance can be determined by the
beam energy, which can be measured
very precisely. In a hadron production
experiment, the resonance energy is
determined by the final-state particles
(electrons or muons) and the resolution
is lower.


5.3.7 Comparison with positronium 
Positronium 


• Positronium is an $e^+e^-$ bound system analogous to the hydrogen
atom.

• The energy levels are predicted by the non-relativistic Schrödinger
equation with the Coulomb potential, but with a reduced
mass $m_e/2$.

• Singlet $^1S_0$ and triplet $^3S_1$ states are split by spin–spin interactions.

• States with the same principal quantum number are split by
spin–orbit interaction.
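As a quick illustration of the second point, the gross structure of positronium follows from the Bohr formula $E_n = -\tfrac12\alpha^2\mu c^2/n^2$ with reduced mass $\mu = m_e/2$. The sketch below uses standard values of $\alpha$ and $m_e c^2$; the function name and the comparison with an infinitely heavy nucleus are illustrative additions, and fine and hyperfine splittings are ignored.

```python
ALPHA = 1 / 137.035999          # fine-structure constant
ELECTRON_MASS_EV = 510998.95    # electron rest energy in eV


def bohr_level_ev(n, reduced_mass_ev):
    """Non-relativistic Coulomb energy level E_n = -alpha^2 mu c^2 / (2 n^2)."""
    return -0.5 * ALPHA ** 2 * reduced_mass_ev / n ** 2


positronium = [bohr_level_ev(n, ELECTRON_MASS_EV / 2) for n in (1, 2, 3)]
heavy_nucleus_ground_state = bohr_level_ev(1, ELECTRON_MASS_EV)

for n, energy in zip((1, 2, 3), positronium):
    print(f"n = {n}: {energy:.3f} eV")
```

The positronium ground state lies at about −6.8 eV, exactly half the binding energy obtained with the full electron mass, because the reduced mass is halved.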


Differences between charmonium and positronium 


• In charmonium, the size of the strong coupling $\alpha_s$ compared with
the electromagnetic $\alpha_{\mathrm{QED}}$ leads to larger splitting from spin–spin
and spin–orbit interactions than would be the case in atomic
physics.

• The potential in charmonium is not $-\alpha_{\mathrm{QED}}/r$ but $-\alpha_s/r + \kappa r$.
At short distances, the potential is Coulomb-like, but at large
distances, the linear confining term dominates.


5.3.8 Bottomonium 
A comparison of the ratio

$$R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)}$$

in the regions near the ρ⁰, J/ψ, and Υ resonances is shown in Fig. 5.11. Following the discovery of the narrow excess in the μ-pair invariant mass distribution measured in p + A → μ⁺μ⁻ + X already described, the region of 9–10 GeV in centre-of-mass energy was studied by the e⁺e⁻ colliders at SLAC, Cornell, and DESY, where the better resolution of these colliders¹⁹ enabled the system to be resolved into three narrow resonances Υ(1S), Υ(2S), and Υ(3S), with masses of 9.46, 10.02, and 10.36 GeV, respectively. A broader resonance, the Υ(4S) with mass 10.58 GeV, decayed to mesons via $B\bar{B}$ pairs, each with a mass ∼5.28 GeV. The bottomonium spectrum is shown in Fig. 5.12.


5.4 Exotic hadrons 


The Standard Model allows more possibilities for hadrons than con- 
sidered so far. We need to add gluons to quarks as building blocks, 
considering different ways in which gluon fields can be configured 
‘connecting’ or ‘gluing’ quarks while making sure that the outcome is 
colourless. 

The simplest object is called a glueball, a quarkless set of gluons that 
is colourless as a whole. No glueball has been unambiguously identified 
so far; expected lifetimes are short and glueballs would couple easily to 
conventional mesons, making unambiguous identification difficult. 
Fig. 5.11 The ratio R in the e⁺e⁻ CMS energy regions: (a) near the ρ; (b) near the J/ψ; (c) near the Υ. From the PDG data plots [115, Fig. 50.6].

Fig. 5.12 The $b\bar{b}$ bottomonium states. From the PDG diagrams [114, p. 1109].


Having glue fields inside a hadron leads to hybrids. It is possible 
to excite gluonic degrees of freedom, making gluonic fields vibrate for 
example. No hybrids have been found so far. 

Then there are tetraquarks, four-quark systems such as $qq\bar{q}\bar{q}$. They could be of two types: either a ‘molecule’ of two mesons, with one $q\bar{q}$ orbiting another $q\bar{q}$, or a diquark system, with a $qq$ diquark binding to a $\bar{q}\bar{q}$ antidiquark (known also as baryonium). There are some candidates for tetraquark states. The most promising one, not matching the mass, lifetime, and other quantum numbers of any conventional meson (known or predicted), is the narrow X(3872) state discovered by the Belle experiment. The X(3872) can decay to π⁺π⁻J/ψ and to γJ/ψ, suggesting that it contains charm and anticharm quarks. Its quantum numbers, $J^{PC} = 1^{++}$, are now well established by the LHCb experiment. Whether it is a ‘molecule’ or a diquark system or a mixture of states is not yet known.

What tetraquarks are in relation to mesons, pentaquarks are in relation to baryons: $qqqq\bar{q}$. Two states discovered by LHCb, $P_c(4450)^+$ and $P_c(4380)^+$ [106], are good candidates for pentaquarks consisting of $uudc\bar{c}$ quarks. How the quarks are bound remains to be established.

There might be even more complicated systems. For example, an equal mixture of u, d, and s quarks could exist as a state as simple as one in which two Λ⁰ baryons are bound together (if the mass of such a state were below the Λ⁰–nucleon threshold then its lifetime would be of the order of days or months), or even macroscopic systems with a larger number of quarks. There is also a possibility of an analogue to a neutron star: a quark star.


Chapter summary 


• Hadrons as $qqq$ and $q\bar{q}$ states.

• Isospin and SU(3) flavour.

• Flavour quantum numbers: strangeness, charm, beauty.

• Patterns of meson and baryon states.

• Heavy $Q\bar{Q}$ states: charmonium and bottomonium.

• Exotic hadrons.


Further reading 


• Griffiths, D. (2008). Introduction to Elementary Particles (2nd edn). Wiley. Chapter 5.

• Perkins, D. H. (1987). Introduction to High Energy Physics (3rd edn). Addison-Wesley. Chapter 5.

• Halzen, F. and Martin, A. D. (1984). Quarks and Leptons: An Introductory Course in Modern Particle Physics. Wiley. Chapter 2.

• Martin, B. R. and Shaw, G. (2008). Particle Physics (3rd edn). Wiley. Chapter 6.

• Close, F. E. (1979). An Introduction to Quarks and Partons. Academic Press. Chapters 1–4 give a more advanced introduction to the quark model and the associated group theory.


Exercises 


(5.1) Using the information in Section 2.2 on addition of angular momentum and Clebsch–Gordan coefficients, check that you can account for all states in the baryon multiplets constructed out of u and d quarks only.

(5.2) Follow through the calculation of the Δ spin-3/2, I-spin-3/2 resonance branching ratios to pion–nucleon states.

(5.3) Referring to Fig. 5.1 and using the PDG data tables, give an explanation of the difference in π⁺p and π⁻p total cross sections at π⁻ beam energies of around 1 GeV in the fixed-target laboratory frame.

(5.4) [...] J/ψ, verify the formula for the total cross-section integral given in eqn 5.3. Use it and the data given to estimate the total width of the resonance. Why was it so surprising?

(5.5) The top quark has a mass of 172.0 ± 0.9 ± 1.3 GeV with an upper limit on its full width of Γ < 13.1 GeV. Estimate its mean lifetime and, by considering the available phase space for its decays, explain why there are no hadronic states including the top quark.
1 Albert Einstein, 1879–1955.

2 Note that in that chapter, transformations are in an ‘active’ sense (i.e. the coordinate system does not change but vectors do) rather than the ‘passive’ sense (i.e. the coordinate system changes but vectors do not) considered in this book. It should also be noted that the metric in [110] is − + + +, in contrast to the one used in this book, which is + − − − (see Note 5).

3 Greek indices μ, ν = 0, 1, 2, 3 and Latin indices i, j = 1, 2, 3.


Relativistic quantum 
mechanics 


The aim of this chapter is to introduce a relativistic formalism that can 
be used to describe particles and their interactions. The emphasis is 
on those elements of the formalism that can be carried on to relativistic 
quantum field theory (RQF), which underpins the theoretical framework 
of high-energy particle physics. 

We begin with a brief summary of special relativity, concentrating 
on 4-vectors and spinors. One-particle states and their Lorentz trans- 
formations follow, leading to the Klein—Gordon and Dirac equations 
for probability amplitudes, i.e. relativistic quantum mechanics (RQM). 
Readers who want to get to RQM quickly, without studying its foun- 
dation in special relativity, can skip the first sections and start reading 
from Section 6.3. 

Intrinsic problems of RQM are discussed and a region of applicability 
of RQM is defined. Free-particle wavefunctions are constructed and par- 
ticle interactions are described using their probability currents. Gauge 
symmetry is introduced, which allows the interaction between a particle 
and a classical gauge field to be described within the formalism. 


6.1 Special relativity 


Einstein’s¹ special relativity is a necessary and fundamental part of any formalism of particle physics. We begin with a brief summary. For a full account, refer to specialized books, for example [128] or [127]. Theory-oriented students with a good mathematical background might want to consult books on groups and their representations, for example [46], followed by introductory books on RQM/RQF, for example [107]. Here we are only going to present conclusions without derivations, avoiding group-theoretical language and aiming at a presentation of key concepts at a qualitative level. Chapter 41 in [110] on spinors is recommended.²

The basic elements of special relativity are 4-vectors (or, strictly speaking, contravariant 4-vectors) such as a 4-displacement³ $x^\mu = (t, \mathbf{x}) = (x^0, x^1, x^2, x^3) = (x^0, x^i)$ or a 4-momentum $p^\mu = (E, \mathbf{p}) = (p^0, p^1, p^2, p^3) = (p^0, p^i)$. 4-vectors have real components and form a vector space. There is a metric tensor $g_{\mu\nu} = g^{\mu\nu}$ that is used to form a dual space to the space of 4-vectors. This dual space is a vector space of linear functionals, known as 1-forms (or covariant 4-vectors), which act on 4-vectors. For every 4-vector $x^\mu$, there is an associated 1-form $x_\mu = g_{\mu\nu}x^\nu$. Such a 1-form is a linear functional that, acting on a 4-vector $y^\mu$, gives a real number $x_\mu y^\mu = g_{\mu\nu}x^\nu y^\mu$. This number is called the scalar product⁴ $x \cdot y$ of $x^\mu$ and $y^\mu$. The Lorentz transformation between two coordinate systems, $\Lambda^\mu{}_\nu$, with $x'^\mu = \Lambda^\mu{}_\nu x^\nu$, leaves the scalar product unchanged, which is equivalent to $g_{\rho\sigma} = g_{\mu\nu}\Lambda^\mu{}_\rho\Lambda^\nu{}_\sigma$.

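In the standard configuration, the Lorentz transformation becomes the Lorentz boost along the first space coordinate direction and is given by⁵

$$\Lambda^\mu{}_\nu = \begin{pmatrix} \gamma & -\gamma\beta & 0 & 0 \\ -\gamma\beta & \gamma & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$

with

$$\beta = \frac{v}{c}, \qquad \gamma = \frac{1}{\sqrt{1 - v^2/c^2}}$$

where v is the velocity of the boost.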
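The invariance of the scalar product under this boost can be checked numerically. The following is a minimal sketch in plain Python (units with c = 1); the helper names `boost_x`, `matmul`, `apply`, and `dot` are chosen for this example only. It forms $\Lambda^T g \Lambda$, which should reproduce the metric, and compares $x \cdot y$ before and after the boost.

```python
import math

# Metric with signature (+, -, -, -).
METRIC = [[1.0, 0.0, 0.0, 0.0],
          [0.0, -1.0, 0.0, 0.0],
          [0.0, 0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0, -1.0]]


def boost_x(beta):
    """4x4 Lorentz boost along the first space axis (standard configuration)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def apply(m, v):
    """Components of the transformed 4-vector m v."""
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]


def dot(x, y):
    """Scalar product g_{mu nu} x^mu y^nu for the diagonal metric above."""
    return sum(METRIC[i][i] * x[i] * y[i] for i in range(4))


lam = boost_x(0.6)
# g_{rho sigma} = g_{mu nu} Lambda^mu_rho Lambda^nu_sigma, i.e. Lambda^T g Lambda = g.
check = matmul(transpose(lam), matmul(METRIC, lam))

x = [2.0, 1.0, -0.5, 3.0]
y = [1.5, 0.2, 0.7, -1.0]
print(dot(x, y), dot(apply(lam, x), apply(lam, y)))
```

The two printed numbers agree to rounding error, and `check` equals `METRIC` entry by entry.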

Two Lorentz boosts along different directions are equivalent to a single 
boost and a space rotation. This means that Lorentz transformations, 
which can be seen as space-time rotations, include Lorentz boosts (rota- 
tions by a purely imaginary angle) as well as space rotations (by a purely 
real angle). Representing Lorentz transformations by 4-dimensional real 
matrices acting on 4-vectors is not well suited for combining Lorentz 
boosts and space rotations in a transparent way. Even a simple question 
like ‘What is the single space rotation that is equivalent to a combination 
of two arbitrary space rotations?’ is hard to answer. 

A better way is to represent Lorentz transformations by 2-dimensional complex matrices. First we consider a 3-dimensional real space and rotations. With every rotation in that 3-dimensional real space we can associate a 2 × 2 complex matrix, called a spin matrix,⁶

$$R = \cos\left(\tfrac{1}{2}\theta\right) + i\sin\left(\tfrac{1}{2}\theta\right)(\sigma_x\cos\alpha + \sigma_y\cos\beta + \sigma_z\cos\gamma)$$

or

$$R = \cos\left(\tfrac{1}{2}\theta\right) + i\sin\left(\tfrac{1}{2}\theta\right)(\mathbf{n}\cdot\boldsymbol{\sigma}) = \exp\left[i\left(\tfrac{1}{2}\theta\right)(\mathbf{n}\cdot\boldsymbol{\sigma})\right] \tag{6.1}$$

where θ is the angle of rotation, α, β, γ are the angles⁷ between the axis of rotation n and the coordinate axes, and $\boldsymbol{\sigma} = (\sigma_x, \sigma_y, \sigma_z)$ are the Pauli matrices. Note that R is unitary: $R^\dagger = R^{-1}$. The vector space of spin matrices (a subspace of all 2 × 2 complex matrices) is thus defined using four basis vectors, such as the unit matrix and three basis vectors formed



4 A similar situation occurs in the infinite-dimensional vector space of states in quantum mechanics (with complex numbers there). For every state, represented by a vector known as a ket, for example |x⟩, there is a 1-form known as a bra, ⟨x|, that, acting on a ket |y⟩, gives a number ⟨x|y⟩, which is called the scalar product of the two kets |x⟩ and |y⟩.

5 The metric is represented by the matrix

$$g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$

6 Also known as Hamilton’s quaternion or a spinor transformation or a rotation operator.

7 Only two of the angles α, β, γ are independent.

8 This representation of rotations is used in video game programming because of this ease of combining rotations quickly.

9 $\mathbf{n}_\rho$ is the unit vector along the direction of the Lorentz boost.

10 Now with four matrices as the basis and X satisfying $X = X^\dagger$ by definition.

11 Spinors are vectors in the mathematical sense since they form a complex vector space, but they are not vectors like a displacement x, because they transform (e.g. under rotation) differently.

using the Pauli matrices: $i\sigma_x$, $i\sigma_y$, $i\sigma_z$. In this basis, the spin matrix R has the coordinates $\cos(\tfrac{1}{2}\theta)$, $\sin(\tfrac{1}{2}\theta)\cos\alpha$, $\sin(\tfrac{1}{2}\theta)\cos\beta$, and $\sin(\tfrac{1}{2}\theta)\cos\gamma$. To combine two rotations, one multiplies the corresponding spin matrices and describes the outcome using the above basis, thus getting all the parameters of the equivalent single rotation.⁸ The next step is to associate each 3-dimensional space (real numbers) vector $\mathbf{x} = (x^1, x^2, x^3)$ with a corresponding spin matrix (there is no unit matrix in the basis here, only the three Pauli matrices):

$$X = x^1\sigma_x + x^2\sigma_y + x^3\sigma_z \tag{6.2}$$

Then, under the space rotation, x is transformed to x′ and X is transformed to $X' = RXR^\dagger = x'^1\sigma_x + x'^2\sigma_y + x'^3\sigma_z$, from which we can read the coordinates of x′.
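The recipe above can be sketched in a few lines of plain Python. The helpers `lin`, `coeffs`, `spin_matrix`, `rotation_parameters`, and `rotate` are names invented for this illustration, and 2 × 2 matrices are stored as nested lists. The example combines a 90° rotation about the first axis with a 90° rotation about the second and reads the equivalent single rotation off the product, then rotates a vector using $X' = RXR^\dagger$.

```python
import math


def lin(c0, c1, c2, c3):
    """Return c0*1 + c1*sigma_x + c2*sigma_y + c3*sigma_z as a 2x2 matrix."""
    return [[c0 + c3, c1 - 1j * c2],
            [c1 + 1j * c2, c0 - c3]]


def coeffs(m):
    """Inverse of lin: coefficients of (1, sigma_x, sigma_y, sigma_z)."""
    return ((m[0][0] + m[1][1]) / 2, (m[0][1] + m[1][0]) / 2,
            (m[1][0] - m[0][1]) / 2j, (m[0][0] - m[1][1]) / 2)


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def spin_matrix(theta, n):
    """R = cos(theta/2) + i sin(theta/2) (n . sigma), eqn 6.1; n is a unit vector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return lin(c, 1j * s * n[0], 1j * s * n[1], 1j * s * n[2])


def rotation_parameters(r):
    """Angle and axis read from the coordinates of R (not valid for theta = 0)."""
    c0, c1, c2, c3 = coeffs(r)
    theta = 2 * math.acos(max(-1.0, min(1.0, complex(c0).real)))
    s = math.sin(theta / 2)
    return theta, [(c / (1j * s)).real for c in (c1, c2, c3)]


def rotate(r, x):
    """Components of x' from X' = R X R^dagger with X = x . sigma (eqn 6.2)."""
    xp = mul(mul(r, lin(0, *x)), dagger(r))
    return [complex(c).real for c in coeffs(xp)[1:]]


r1 = spin_matrix(math.pi / 2, (1, 0, 0))        # 90 degrees about the first axis
r2 = spin_matrix(math.pi / 2, (0, 1, 0))        # 90 degrees about the second axis
theta, axis = rotation_parameters(mul(r2, r1))  # r1 acts first, then r2
print(math.degrees(theta), axis)

x_rot = rotate(spin_matrix(math.pi / 2, (0, 0, 1)), (1, 0, 0))
print(x_rot)
```

The product corresponds to a single rotation by 120° about the axis (1, 1, 1)/√3. With the sign convention of eqn 6.1, a rotation by θ about the third axis sends the components (1, 0, 0) to (cos θ, −sin θ, 0), as expected for a rotation of the coordinate system (the passive sense).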

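The beauty of this approach is that it extends seamlessly to space-time rotations, i.e. to the Lorentz transformations. The spin matrix R of eqn 6.1 becomes the Lorentz transformation

$$L = \exp\left[\tfrac{1}{2}(-\boldsymbol{\rho} + i\theta\mathbf{n})\cdot\boldsymbol{\sigma}\right] \tag{6.3}$$

where $\boldsymbol{\rho} = \rho\,\mathbf{n}_\rho$ is the rapidity.⁹ The rapidity is related to the Lorentz β and γ parameters by

$$\tanh\rho = \beta, \qquad \cosh\rho = \gamma, \qquad \sinh\rho = \beta\gamma$$

Now, a combination of two Lorentz transformations is very transparent: just addition of real and imaginary parts in the exponent. Association of a 4-vector $x^\mu$ with a Hermitian spin matrix¹⁰ X,

$$X = x^0 + x^1\sigma_x + x^2\sigma_y + x^3\sigma_z \tag{6.4}$$

allows us to get its Lorentz-transformed coordinates from $X' = LXL^\dagger = x'^0 + x'^1\sigma_x + x'^2\sigma_y + x'^3\sigma_z$. Finally, the Lorentz boost alone (θ = 0) along $\mathbf{n}_\rho$ is

$$L = \exp\left(-\boldsymbol{\rho}\cdot\tfrac{1}{2}\boldsymbol{\sigma}\right) = \cosh\left(\tfrac{1}{2}\rho\right) - \mathbf{n}_\rho\cdot\boldsymbol{\sigma}\,\sinh\left(\tfrac{1}{2}\rho\right) \tag{6.5}$$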
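As a consistency check, the 2 × 2 boost of eqn 6.5 should reproduce the familiar 4 × 4 boost of the standard configuration. The sketch below (plain Python; `lin`, `coeffs`, `boost_matrix`, and `transform` are names assumed for this example) applies $X' = LXL^\dagger$ for a boost along the first axis with β = 0.6 and compares the result with $x'^0 = \gamma(x^0 - \beta x^1)$, $x'^1 = \gamma(x^1 - \beta x^0)$. It also checks that two boosts along the same direction combine by adding rapidities.

```python
import math


def lin(c0, c1, c2, c3):
    """Return c0*1 + c1*sigma_x + c2*sigma_y + c3*sigma_z as a 2x2 matrix (eqn 6.4)."""
    return [[c0 + c3, c1 - 1j * c2],
            [c1 + 1j * c2, c0 - c3]]


def coeffs(m):
    """Inverse of lin: coefficients of (1, sigma_x, sigma_y, sigma_z)."""
    return ((m[0][0] + m[1][1]) / 2, (m[0][1] + m[1][0]) / 2,
            (m[1][0] - m[0][1]) / 2j, (m[0][0] - m[1][1]) / 2)


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def boost_matrix(rho, n):
    """L = cosh(rho/2) - (n . sigma) sinh(rho/2), eqn 6.5; n is a unit vector."""
    ch, sh = math.cosh(rho / 2), math.sinh(rho / 2)
    return lin(ch, -sh * n[0], -sh * n[1], -sh * n[2])


def transform(l, x):
    """Components x'^mu read off from X' = L X L^dagger."""
    xp = mul(mul(l, lin(*x)), dagger(l))
    return [complex(c).real for c in coeffs(xp)]


beta = 0.6
rho = math.atanh(beta)      # tanh(rho) = beta
gamma = math.cosh(rho)      # cosh(rho) = gamma
x = [2.0, 1.0, -0.5, 3.0]

x_spin = transform(boost_matrix(rho, (1, 0, 0)), x)
x_4x4 = [gamma * (x[0] - beta * x[1]), gamma * (x[1] - beta * x[0]), x[2], x[3]]
print(x_spin)
print(x_4x4)

# Collinear boosts: the rapidities simply add.
two_steps = mul(boost_matrix(0.3, (1, 0, 0)), boost_matrix(0.5, (1, 0, 0)))
one_step = boost_matrix(0.8, (1, 0, 0))
```

The determinant of X equals $x^\mu x_\mu$, so its invariance under $L$ (which has unit determinant) is another way of seeing that the scalar product is preserved.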


6.1.1 Spinors 


Spin matrices can act on 2-component complex vectors called spinors.¹¹ Spinors, like vectors and tensors, are used in a number of different areas of physics, including classical mechanics. They play a particularly important role in RQM and in this section we will describe them in some detail. Under a space rotation R, a spinor ξ ($\xi^\alpha$ to be more precise) transforms in the following way:

$$\xi' = R\xi$$

For comparison, the coordinates of a vector x transform under a space rotation as

$$X' = RXR^\dagger = x'^1\sigma_x + x'^2\sigma_y + x'^3\sigma_z$$

Thus, in a rotation of the coordinate system by θ = 2π, R = −1 because of the ½θ in R, which gives ξ′ = −ξ and x′ = x. Continuing the rotation by a further 2π, so all together by 4π, results in ξ′ = ξ. Does that counter-intuitive minus sign resulting from the 2π rotation have any physical significance? Yes it does, as was demonstrated in a beautiful experiment [122] using neutrons.
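The sign flip is easy to confirm numerically. This short sketch (plain Python; the axis and the spinor components are arbitrary values chosen for illustration) builds the spin matrix of eqn 6.1 for θ = 2π and θ = 4π and applies it to a spinor.

```python
import math


def spin_matrix(theta, n):
    """R = cos(theta/2) + i sin(theta/2) (n . sigma) for a unit vector n."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c + 1j * s * n[2], 1j * s * (n[0] - 1j * n[1])],
            [1j * s * (n[0] + 1j * n[1]), c - 1j * s * n[2]]]


def apply(r, xi):
    """Transformed spinor xi' = R xi."""
    return [r[0][0] * xi[0] + r[0][1] * xi[1],
            r[1][0] * xi[0] + r[1][1] * xi[1]]


axis = (0.6, 0.0, 0.8)            # any unit vector will do
xi = [0.3 + 0.4j, -0.5 + 0.1j]    # an arbitrary spinor

xi_2pi = apply(spin_matrix(2 * math.pi, axis), xi)   # equals -xi up to rounding
xi_4pi = apply(spin_matrix(4 * math.pi, axis), xi)   # equals +xi up to rounding
print(xi_2pi, xi_4pi)
```

A vector is unaffected by the 2π rotation because the two factors of R = −1 in $RXR^\dagger$ cancel.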

One of two coherent neutron beams passes through a magnetic field of variable strength. In the magnetic field, the neutrons’ magnetic moments precess with the Larmor frequency and the angle of the precession is easily calculated as a function of the strength of the magnetic field. After passing through the magnetic field, the beam interferes with the second beam, which followed a path outside the magnetic field. As demonstrated in Fig. 6.1, an angle of 4π is needed for the neutron wavefunction to reproduce itself. A 2π rotation gives a factor −1 in front of the original neutron wavefunction, as predicted for a spin-½ spinor.

So far, one could think about spinors as being identical with the Pauli spinors¹² of non-relativistic quantum mechanics (NRQM). This is not quite right. The reason is that the Pauli spinors of NRQM live in space and not in space-time and we do not know how to Lorentz-transform them. What we are constructing now are Weyl¹³ spinors (there is more than one type) living in space-time, and we do know how to Lorentz-transform these. Weyl spinors are needed to construct Dirac spinors, or bispinors, since two Weyl spinors of different type are needed for one Dirac spinor.

12 Eigenstates of spin operators, like the spin projection on the z axis, $\tfrac{1}{2}\hbar\sigma_z$, for a spin-½ particle in NRQM [94].

13 Hermann Weyl, 1885–1955.

Fig. 6.1 A phase change of 4π is needed to get the same intensity from the interference of two neutron beams. Taken from [122].
We will look now at the spin matrices X of eqn 6.4 from a different viewpoint, seeing them as tensors created by a tensor product¹⁴ of 2-dimensional spinors. Weyl spinors are rooted in space-time, not only in space like Pauli spinors. Consider the Lorentz transformation L of a spin matrix X built from spinors $\xi = \begin{pmatrix} a \\ b \end{pmatrix}$ and $\eta = \begin{pmatrix} c \\ d \end{pmatrix}$:

$$X' = \begin{pmatrix} a' \\ b' \end{pmatrix}\begin{pmatrix} c' & d' \end{pmatrix} = L\begin{pmatrix} a \\ b \end{pmatrix}\begin{pmatrix} c & d \end{pmatrix}L^\dagger = LXL^\dagger$$

14 A tensor product or outer product or dyadic product, something like this: $\begin{pmatrix} a \\ b \end{pmatrix}\begin{pmatrix} c & d \end{pmatrix} = \begin{pmatrix} ac & ad \\ bc & bd \end{pmatrix}$.

15 Note that this gives $\xi^\alpha\xi_\alpha = 0$ and $\xi^\alpha\zeta_\alpha = -\xi_\alpha\zeta^\alpha$.

We can see that ξ′ = Lξ but η′ = L*η (after taking the transpose, $L^{\dagger T} = L^*$). There are two different types of spinors, transforming differently. Those that transform with the complex conjugate L* are called dotted Weyl spinors, distinguished from the undotted $\xi^\alpha$ by a dot written above the index: $\eta^{\dot\alpha}$; for example, $(\xi^\alpha)^*$ is a dotted spinor. The spin matrix X is then written as $X^{\alpha\dot\beta}$ (α = 1, 2 and $\dot\beta$ = 1, 2). There is a metric tensor

$$\epsilon_{\alpha\beta} = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$

(and an identical one for the dotted spinors) to create a dual space of 1-forms:

$$\xi_\alpha = \epsilon_{\alpha\beta}\xi^\beta \qquad (\text{and } \eta_{\dot\alpha} = \epsilon_{\dot\alpha\dot\beta}\eta^{\dot\beta})$$

(thus $\xi_1 = \xi^2$ and $\xi_2 = -\xi^1$). The scalar product¹⁵

$$\xi^\alpha\zeta_\alpha = \xi^\alpha\epsilon_{\alpha\beta}\zeta^\beta$$

(and similarly for the dotted spinors) is invariant with respect to the Lorentz transformation. Because undotted Weyl spinors and dotted Weyl spinors are different objects, the scalar product, or in general any contraction, can only be performed on the same type of spinors: an undotted index is contracted with another undotted index and a dotted index is contracted with another dotted one; one cannot contract a dotted index with an undotted one.
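The invariance of the undotted scalar product can be illustrated numerically. In the sketch below (plain Python; `rotation`, `boost`, `lower`, and `scalar` are helper names assumed for this example, and the numerical inputs are arbitrary) a general Lorentz transformation is built as a boost times a rotation, and $\xi^\alpha\zeta_\alpha = \xi^1\zeta^2 - \xi^2\zeta^1$ is evaluated before and after transforming both spinors with L. The invariance follows from det L = 1.

```python
import math


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def rotation(theta, n):
    """Spin matrix of eqn 6.1 for a unit vector n."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    return [[c + 1j * s * n[2], 1j * s * (n[0] - 1j * n[1])],
            [1j * s * (n[0] + 1j * n[1]), c - 1j * s * n[2]]]


def boost(rho, n):
    """Boost of eqn 6.5 for rapidity rho along the unit vector n."""
    ch, sh = math.cosh(rho / 2), math.sinh(rho / 2)
    return [[ch - sh * n[2], -sh * (n[0] - 1j * n[1])],
            [-sh * (n[0] + 1j * n[1]), ch + sh * n[2]]]


def lower(xi):
    """xi_alpha = eps_{alpha beta} xi^beta, i.e. xi_1 = xi^2 and xi_2 = -xi^1."""
    return [xi[1], -xi[0]]


def scalar(xi, zeta):
    """Scalar product xi^alpha zeta_alpha."""
    low = lower(zeta)
    return xi[0] * low[0] + xi[1] * low[1]


lorentz = mul(boost(0.9, (0.0, 0.6, 0.8)), rotation(1.3, (1.0, 0.0, 0.0)))
xi = [0.3 + 0.4j, -0.5 + 0.1j]
zeta = [1.2 - 0.7j, 0.25 + 0.9j]

before = scalar(xi, zeta)
after = scalar(apply(lorentz, xi), apply(lorentz, zeta))
det = lorentz[0][0] * lorentz[1][1] - lorentz[0][1] * lorentz[1][0]
print(before, after, det)
```

The antisymmetry of the metric also gives `scalar(xi, xi) == 0` for any spinor, in line with Note 15.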

In order to gain more insight, we go beyond Lorentz transformations and consider space inversion, P: $P(x^0, \mathbf{x}) = (x^0, -\mathbf{x})$. The space inversion P commutes with space rotations, but not with Lorentz transformations, because Lorentz transformations affect the time component and P does not. To illustrate this, consider a boost Λ followed by a space inversion P in 4-dimensional space-time, PΛ. It is evident that this is equivalent to a space inversion P followed by a boost, PΛ = Λ′P, but Λ ≠ Λ′: if Λ is a boost with velocity v, then Λ′ is a boost with velocity −v. Thus [P, Λ] ≠ 0, and therefore P is not proportional to the identity operator, which commutes with every operator.
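A quick numerical check of this statement with 4 × 4 matrices is sketched below (plain Python; `boost_x`, `rot_z`, and `matmul` are illustrative helper names). It verifies PΛ(v) = Λ(−v)P for a boost along the first axis, shows that P and the boost do not commute, and confirms that P does commute with a space rotation.

```python
import math

PARITY = [[1.0, 0.0, 0.0, 0.0],
          [0.0, -1.0, 0.0, 0.0],
          [0.0, 0.0, -1.0, 0.0],
          [0.0, 0.0, 0.0, -1.0]]


def boost_x(beta):
    """Boost with velocity beta (c = 1) along the first space axis."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, -gamma * beta, 0.0, 0.0],
            [-gamma * beta, gamma, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def rot_z(angle):
    """Rotation of the coordinate system about the third space axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1.0, 0.0, 0.0, 0.0],
            [0.0, c, s, 0.0],
            [0.0, -s, c, 0.0],
            [0.0, 0.0, 0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))


beta = 0.6
p_then_boost = matmul(PARITY, boost_x(beta))        # P Lambda(v)
boost_back_then_p = matmul(boost_x(-beta), PARITY)  # Lambda(-v) P
boost_then_p = matmul(boost_x(beta), PARITY)        # Lambda(v) P

print(max_diff(p_then_boost, boost_back_then_p))    # zero: P Lambda(v) = Lambda(-v) P
print(max_diff(p_then_boost, boost_then_p))         # non-zero: [P, Lambda] != 0
print(max_diff(matmul(PARITY, rot_z(0.4)), matmul(rot_z(0.4), PARITY)))  # zero
```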

We now return to Weyl spinors. Because space inversion is not proportional to the identity operator, space inversion does not transform $\xi^\alpha$ into $\xi^\alpha$ times a number. It transforms $\xi^\alpha$ into a spinor of a different type, which transforms under the Lorentz transformation differently to $\xi^\alpha$. Just as Pauli spinors represent spin in NRQM, Weyl spinors are going to represent spin in RQM. We know that space inversion leaves spin unaffected, and therefore, under P, $\xi^\alpha$ needs to be transformed to a spinor that transforms under space rotations in the same way as $\xi^\alpha$ and represents the same spin state.¹⁶ Out of all three possibilities, only the 1-form $\eta_{\dot\alpha}$ transforms in the same way. So, under space inversion, $\xi^\alpha \to \eta_{\dot\alpha}$ and $\eta_{\dot\alpha} \to \xi^\alpha$. In the discussion on space inversion, P² = 1 is assumed. This is fine for all particles except Majorana particles. For a Majorana particle, P² = −1 and the transformation of spinors under P is different to that given here. We will define a Majorana particle later.

As the spinors $\xi^\alpha$ and $\eta_{\dot\alpha}$ play a very important role in RQM, the following is a summary of how they behave under various transformations (note here that $(L^\dagger)^{-1} = \epsilon L^*\epsilon^{-1}$):

$$\text{rotation } R: \quad \xi^\alpha \to R\xi^\alpha, \quad \eta_{\dot\alpha} \to R\eta_{\dot\alpha} \tag{6.6}$$

$$\text{Lorentz transformation } L: \quad \xi^\alpha \to L\xi^\alpha, \quad \eta_{\dot\alpha} \to (L^\dagger)^{-1}\eta_{\dot\alpha} \tag{6.7}$$

$$\text{Lorentz boost } L^\dagger = L: \quad \xi^\alpha \to L\xi^\alpha, \quad \eta_{\dot\alpha} \to L^{-1}\eta_{\dot\alpha} \tag{6.8}$$

$$\text{space inversion } P: \quad \xi^\alpha \to \eta_{\dot\alpha}, \quad \eta_{\dot\alpha} \to \xi^\alpha \tag{6.9}$$


Suppose there is a spin-½ particle with 4-momentum $p^\mu$ described in a particular reference frame by $p^{\alpha\dot\beta}$ via eqn 6.4.¹⁷ Note that $p^{\alpha\dot\beta}$ is identical to $p^{\dot\beta\alpha}$; it is not a transpose operation. Following our non-relativistic intuition gained from using Pauli spinors, we want to represent the spin of that particle by $\xi^\alpha$. In an attempt to write a covariant¹⁸ equation, we could try to contract the undotted index α, but that would lead to something like

$$p_{\alpha\dot\beta}\xi^\alpha = m\eta_{\dot\beta}$$

where $\eta_{\dot\beta}$ is a dotted spinor different from $\xi^\alpha$ related to the uncontracted dotted index, and m is a dimensionful scalar¹⁹ parameter appearing because of the energy dimensionality of $p^\mu$. So the equation is covariant only when m = 0, because we do not have any dotted spinor in hand to put on the right-hand side. A similar outcome is obtained if we have only $\eta_{\dot\beta}$ instead of $\xi^\alpha$. Having a column vector with two complex numbers is not enough; we also need to indicate how the two numbers transform under Lorentz transformations. Similarly, for vectors in 3-dimensional space, three real numbers are not enough; we need to know whether they represent a polar or an axial vector, since these transform differently under space inversion. So insisting on only one type of spinor excludes the other type, because they transform differently.
16 For a more complete treatment of the material presented here on spinors, Chapter III of [50] is recommended.

17 We have

$$p^{1\dot1} = p_{2\dot2} = p^0 + p^3, \qquad p^{2\dot2} = p_{1\dot1} = p^0 - p^3$$

$$p^{1\dot2} = -p_{2\dot1} = p^1 - ip^2, \qquad p^{2\dot1} = -p_{1\dot2} = p^1 + ip^2$$

18 Here and in the rest of this chapter, covariant means covariant with respect to Lorentz transformations.

19 Henceforth, scalar means scalar with respect to Lorentz transformations.

20 In quantum mechanics, a helicity operator representing the projection of a particle’s spin on the direction of its momentum is defined as $\mathbf{p}\cdot\boldsymbol{\sigma}/|\mathbf{p}|$. For a massless particle, this is equivalent to $\mathbf{p}\cdot\boldsymbol{\sigma}/p^0$.

21 We have $p^{\alpha\dot\beta}p_{\gamma\dot\beta} = p^\mu p_\mu\,\delta^\alpha_\gamma$.

22 P. A. M. Dirac, 1902–1984.

23 We have

$$\cosh(\rho/2) = \frac{E + m}{\sqrt{2m(E + m)}}, \qquad \sinh(\rho/2) = \frac{|\mathbf{p}|}{\sqrt{2m(E + m)}}$$

Consequently, we end up with two independent Lorentz-invariant Weyl equations:

$$p_{\alpha\dot\beta}\xi^\alpha = 0, \qquad (p^0 - \mathbf{p}\cdot\boldsymbol{\sigma})\xi = 0 \tag{6.10}$$

$$p^{\alpha\dot\beta}\eta_{\dot\beta} = 0, \qquad (p^0 + \mathbf{p}\cdot\boldsymbol{\sigma})\eta = 0 \tag{6.11}$$

In the context of RQM, eqn 6.10 represents an equation of motion for a free massless spin-½ particle with positive helicity²⁰ and eqn 6.11 an equation of motion for a different free massless spin-½ particle with negative helicity. Neither equation is covariant under space inversion, and each violates parity, because the space inversion, eqn 6.9, sends each spinor beyond the formalism: only one type of spinor is present in each formalism. At present, there are no known particles that could be described by either of the Weyl equations. If the electron neutrino were exactly massless, it would be described by eqn 6.11 and the hypothetically massless and different electron antineutrino would be described by eqn 6.10.
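The helicity content of the two Weyl equations can be checked with explicit spinors. The sketch below (plain Python) uses the standard helicity eigenvectors of $\mathbf{n}\cdot\boldsymbol{\sigma}$ for a momentum direction given by polar angles (θ, φ); the particular parametrization and the helper names `p_dot_sigma`, `matvec`, and `helicity` are assumptions of this example, not taken from the text.

```python
import cmath
import math


def p_dot_sigma(p):
    """The 2x2 matrix p . sigma for a 3-vector p."""
    return [[p[2], p[0] - 1j * p[1]],
            [p[0] + 1j * p[1], -p[2]]]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


theta, phi, pmag = 1.1, 0.7, 2.5
p = (pmag * math.sin(theta) * math.cos(phi),
     pmag * math.sin(theta) * math.sin(phi),
     pmag * math.cos(theta))
p0 = pmag   # massless particle: p^0 = |p|

# Eigenvectors of (n . sigma) with eigenvalues +1 and -1.
xi = [math.cos(theta / 2), cmath.exp(1j * phi) * math.sin(theta / 2)]
eta = [-cmath.exp(-1j * phi) * math.sin(theta / 2), math.cos(theta / 2)]

ps_xi = matvec(p_dot_sigma(p), xi)
ps_eta = matvec(p_dot_sigma(p), eta)
res_610 = [p0 * xi[i] - ps_xi[i] for i in range(2)]     # (p^0 - p.sigma) xi
res_611 = [p0 * eta[i] + ps_eta[i] for i in range(2)]   # (p^0 + p.sigma) eta


def helicity(spinor):
    """Expectation value of p . sigma / |p| in a normalized spinor."""
    s = matvec(p_dot_sigma(p), spinor)
    return (spinor[0].conjugate() * s[0] + spinor[1].conjugate() * s[1]).real / pmag


xi_c, eta_c = [complex(c) for c in xi], [complex(c) for c in eta]
print(res_610, res_611, helicity(xi_c), helicity(eta_c))
```

Both residuals vanish to rounding error, with helicity +1 for the solution of eqn 6.10 and −1 for the solution of eqn 6.11.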

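Suppose now that we have $p^{\alpha\dot\beta}$ and two different spinors $\xi^\alpha$ and $\eta_{\dot\beta}$ to describe a spin-½ particle. First, we contract the undotted index, giving $p_{\alpha\dot\beta}\xi^\alpha = m\eta_{\dot\beta}$; then acting with²¹ $p^{\alpha\dot\beta}$ on $\eta_{\dot\beta}$ from that equation gives $m\xi^\alpha$ under the condition that $m^2 = p^\mu p_\mu$. The result is a covariant set of equations

$$p^{\alpha\dot\beta}\eta_{\dot\beta} = m\xi^\alpha, \qquad (p^0 + \mathbf{p}\cdot\boldsymbol{\sigma})\eta = m\xi$$

$$p_{\alpha\dot\beta}\xi^\alpha = m\eta_{\dot\beta}, \qquad (p^0 - \mathbf{p}\cdot\boldsymbol{\sigma})\xi = m\eta \tag{6.12}$$

Requiring that, under space inversion, $\xi^\alpha$ and $\eta_{\dot\alpha}$ be transformed into each other as in eqn 6.9 makes the set of equations 6.12 invariant under space inversion, because, simultaneously, $p^{\alpha\dot\beta}$ and $p_{\alpha\dot\beta}$ are also transformed into each other. The spinors $\xi^\alpha$ and $\eta_{\dot\alpha}$ are combined into a single four-component bispinor called the Dirac²² spinor and the two equations become one equation called the Dirac equation. In the context of RQM, the Dirac equation describes a spin-½ particle like the electron.

In order to gain more insight into the origin of the Dirac equation, consider the Lorentz boost, eqns 6.5 and 6.8, from the rest frame, momentum p = 0, to the frame in which the particle has energy E and momentum p. The relevant spinors transform as²³

$$\xi(\mathbf{p}) = [\cosh(\rho/2) - \mathbf{n}_\rho\cdot\boldsymbol{\sigma}\sinh(\rho/2)]\,\xi(0) \tag{6.13}$$

$$\eta(\mathbf{p}) = [\cosh(\rho/2) + \mathbf{n}_\rho\cdot\boldsymbol{\sigma}\sinh(\rho/2)]\,\eta(0) \tag{6.14}$$

which, since in the passive sense used here the particle moves opposite to the boost direction ($\mathbf{p} = -|\mathbf{p}|\,\mathbf{n}_\rho$), can be written as

$$\xi(\mathbf{p}) = \frac{E + m + \mathbf{p}\cdot\boldsymbol{\sigma}}{\sqrt{2m(E + m)}}\,\xi(0) \tag{6.15}$$

$$\eta(\mathbf{p}) = \frac{E + m - \mathbf{p}\cdot\boldsymbol{\sigma}}{\sqrt{2m(E + m)}}\,\eta(0) \tag{6.16}$$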

In the particle’s rest frame and in all frames moving with respect to it slowly enough that the Lorentz boost can be approximated by a Galilean transformation (not affecting time) when transforming between those frames, any differences in how spinors with dotted or undotted indexes transform disappear. In that case, spinors effectively live in 3 real dimensions. Inspecting eqn 6.5, we can see that in the limit β → 0, L tends to the unit matrix and therefore, under the Galilean transformation, spinors do not change. Thus, at rest, both Weyl spinors, $\xi^\alpha$ and $\eta_{\dot\alpha}$, become effectively identical to the same Pauli spinor and we can write $\xi^\alpha(0) = \eta_{\dot\alpha}(0)$. This allows us, after some algebra, to remove p = 0 spinors from eqns 6.15 and 6.16 and to obtain the Dirac equation, eqn 6.12.
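The ‘some algebra’ can be cross-checked numerically. The sketch below (plain Python; `boosted_spinors` and `dirac_residuals` are names made up for this illustration, and the momentum, mass, and rest spinor are arbitrary test values) starts from a common rest-frame spinor ξ(0) = η(0) = χ, builds ξ(p) and η(p) from eqns 6.15 and 6.16, and evaluates both halves of eqn 6.12.

```python
import math


def p_dot_sigma(p):
    """The 2x2 matrix p . sigma for a 3-vector p."""
    return [[p[2], p[0] - 1j * p[1]],
            [p[0] + 1j * p[1], -p[2]]]


def matvec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1], m[1][0] * v[0] + m[1][1] * v[1]]


def boosted_spinors(p, m, chi):
    """xi(p) and eta(p) from eqns 6.15 and 6.16 with xi(0) = eta(0) = chi."""
    e = math.sqrt(sum(c * c for c in p) + m * m)
    norm = math.sqrt(2 * m * (e + m))
    ps = matvec(p_dot_sigma(p), chi)
    xi = [((e + m) * chi[i] + ps[i]) / norm for i in range(2)]
    eta = [((e + m) * chi[i] - ps[i]) / norm for i in range(2)]
    return e, xi, eta


def dirac_residuals(p, m, chi):
    """Left-hand side minus right-hand side for both equations of eqn 6.12."""
    e, xi, eta = boosted_spinors(p, m, chi)
    s_xi = matvec(p_dot_sigma(p), xi)
    s_eta = matvec(p_dot_sigma(p), eta)
    first = [e * eta[i] + s_eta[i] - m * xi[i] for i in range(2)]   # (p0 + p.sigma) eta - m xi
    second = [e * xi[i] - s_xi[i] - m * eta[i] for i in range(2)]   # (p0 - p.sigma) xi - m eta
    return first, second


res_up = dirac_residuals((0.3, -1.2, 2.0), 0.511, [1.0, 0.0])
res_mixed = dirac_residuals((0.3, -1.2, 2.0), 0.511, [0.6, 0.8j])
print(res_up, res_mixed)
```

All residuals vanish to rounding error for any rest spinor χ, which is the statement that boosting the rest-frame spinors is equivalent to solving the Dirac equation.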

Thus the Dirac equation is equivalent to the Lorentz boost. This 
should be expected—once an object, like a bispinor, is found to rep- 
resent a particle in its rest frame, the only thing left to do is to boost it 
to another frame as needed. 


6.2 One-particle states 


The fact that quantum states of free relativistic particles are fully defined by the Lorentz transformation supplemented by the space-time translation was discovered by Wigner.²⁴ Here we will follow his idea in a qualitative way just to get the main concept across.

First, we note that Lorentz transformations are not able to transform a given arbitrary 4-momentum $p^\mu$ into every other possible 4-momentum. Instead, the vector space of 4-momenta is divided into subspaces of 4-momenta that can be Lorentz-transformed into each other. Three of those subspaces represent experimentally known states. The simplest, at this stage, is the vacuum state given by the conditions $p^\mu = 0$ and $p^\mu p_\mu = 0$. There is no Lorentz transformation that would transform a 4-momentum not satisfying these conditions into one that does, and vice versa. We will not study vacuum states in this book, and therefore we move directly to consider two other possibilities.

A 4-momentum subspace related to massive particles, like the electron, is given by the condition $p^\mu p_\mu > 0$. In addition to a 4-momentum $p^\mu$, what other degrees of freedom are present and which geometrical objects represent them? To answer this question, we can consider Lorentz transformations that leave $p^\mu$ invariant.²⁵ To see what these are, we can transform $p^\mu$ to the particle rest frame, where $p'^\mu = (\text{mass}, 0, 0, 0)$, find the largest subset of Lorentz transformations leaving $p'^\mu$ invariant, and then transform back to the same $p^\mu$. It turns out, as intuitively expected, that the desired transformations are space rotations acting on 2s + 1 spinors representing 2s + 1 spin projections of a spin-s particle. Thus, the electron, s = ½, is represented by two Dirac spinors, or, in fact, by two Dirac spinors multiplied by a dimensionless scalar. To get the scalar, we add space-time translations. Looking for a theory that is space-time translation-invariant,²⁶ we are looking for the free-particle energy and momentum eigenstates that, in the position representation, lead to the scalar $\exp(-ip^\mu x_\mu)$.
24 Eugene Wigner, 1902–1995.

25 The group of such transformations is known as the little group.

26 Implying energy and momentum conservation.

27 For a discussion of some subtle issues, see [107].


The third 4-momentum subspace is defined by the conditions $p^\mu \neq 0$ and $p^\mu p_\mu = 0$. Photons belong to this class. The question is again to find the largest subset of Lorentz transformations leaving $p^\mu$ invariant. There is no rest frame in this case, and therefore, instead, we transform an arbitrary $p^\mu$ to the frame where $p'^\mu = (\omega, 0, 0, \omega)$. We can see that the largest²⁷ subset of the Lorentz transformations leaving $p'^\mu$ invariant are the rotations in the $(x^1, x^2)$ plane. As a result, a spin-s massless particle is represented by only one state, a helicity eigenstate, and not by 2s + 1 states as in the massive case. This is an important difference. In order to get parity-conserving electromagnetism with photons having either helicity + or helicity − states, we put those two, in principle different, helicity states into one theory.


6.2.1 Fields and probability amplitudes 


We have now everything needed to develop RQM and to describe fun- 
damental particles and their interactions. But before we move on, we 
pause to look at a larger picture of which RQM is only a part. The 
Dirac equation, for example, can be studied in the context of classical 
field theory or RQF or RQM. The algebra will often be identical, but 
the basic objects and the interpretation are different. 

The most natural way to proceed from here would be to study a classical field theory. The paradigm for this is classical electromagnetism described in terms of the tensor field $F^{\mu\nu}$ or the 4-vector potential field $A^\mu$:

$$F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$$

Then such a field, for example a classical electron field, i.e. a classical Dirac spinor field Ψ, would be quantized, promoting the Ψ of classical field theory to an operator $\hat{\Psi}$ of RQF. Probability amplitudes would be obtained by taking matrix elements of $\hat{\Psi}$ of RQF sandwiched between particle states living in a suitably constructed space. In RQM, Ψ represents a particle state and is not an operator. Taking the vacuum to one-particle matrix element of the field operator $\hat{\Psi}$ of RQF, we get Ψ of RQM, which in the position representation is called the wavefunction. In RQM, one can also have states describing many particles, but only a fixed number of them. At high energies, much larger than the masses of the particles involved, particles can be created and the number of particles cannot be fixed; RQM is not adequate for this and RQF has to be used instead.

It is not appropriate in this text to go into sufficient detail to enable 
a proper understanding of classical field theory and RQF. Fortunately, 
considering the most important aspects of physics that we require, RQF 
gives the same results as those we will obtain in RQM. Differences will 
be in details beyond leading effects. The one important exception is 
that we will be missing the idea of a vacuum state. In RQF, a vacuum 
is not a ‘nothingness’, although particles are absent. For example, the 
QCD vacuum is a very complicated state. We will, however, consider the
nature of the electroweak vacuum in Chapter 12, since it is fundamental 
to the origin of mass in the Standard Model. 


6.3 The Klein—Gordon equation 


RQM of spin-0 particles was considered by Schrödinger first, before he 
published his famous equation for the non-relativistic case. He aban- 
doned RQM because of formal difficulties that were only understood 
many years later. Here, we will see what they are and then define an 
area of applicability of RQM. 

As argued earlier, a spin-0 particle with 


pp, =m? >0 (6.17) 28We have 
in the position representation is expected to be described by a scalar p" = ior 
wavefunction ~ exp(—ip"z,,). Replacing the energy by i0/0t and the mo- p= 32 
mentum by —iV in eqn 6.17,75 we get the Klein—Gordon (KG) equation?’ Ox ðt P 
of RQM in the position representation: p = id" ið; er 
x? 
( + m)U(E, x) =0 (6.18) 29Sometimes known as the Klein- 
Gordon—Fock equation. 
where 
82 
— tð, = — V2 
ONO, Je V 


For a particle at rest, —iVW(t,x) = 0, only the time (proper time 7) 
derivative would be present in eqn 6.18 and there would be two inde- 
pendent solutions: U+(7,x) = exp(+imr)W* (0,0). Therefore, in a frame 


in which the particle has momentum p and energy Ep = +y p? +m? > 0 
(the subscript ‘p’ is for the plus sign, indicating that Ep is positive), we 
get, as expected,°° 30We have mr = Peti: 


W* (t,x) = Nexp(—ip- x) = N exp(—iEpt + ip: x) (6.19) 


where N is a normalization constant that will be defined shortly. 

Instead of boosting the other solution, i.e. taking the $(-E_p, \mathbf{p})$ eigenstate,
we take the complex conjugate of $\Psi^+(t,\mathbf{x})$, corresponding to the
$(-E_p, -\mathbf{p})$ eigenstate, to get$^{31}$

$$ \Psi^-(t,\mathbf{x}) = N\exp(+ip\cdot x) = N\exp(+iE_p t - i\mathbf{p}\cdot\mathbf{x}) \tag{6.20} $$

31 Why we are doing this should become clear after reading Section 6.3.1.


By a direct substitution, one can check that a general solution of
eqn 6.18 is indeed a linear combination of $\Psi^+(t,\mathbf{x})$ and $\Psi^-(t,\mathbf{x})$.

We have obtained, as expected, $\Psi^+(t,\mathbf{x})$, but, in addition, we also
have $\Psi^-(t,\mathbf{x})$. This is the first puzzle of RQM, the nature of which will
become clearer when we progress a little further. Both solutions of the
KG equation are eigenfunctions of the energy operator $i\partial/\partial t$: $\Psi^+(t,\mathbf{x})$
with an eigenvalue $E_p$ and $\Psi^-(t,\mathbf{x})$ with $-E_p$, a negative energy for a
free particle!

32 This can be proved from the Klein–Gordon Lagrangian and Noether's theorem.

33 There is no problem here, since $\rho$ is the time-like component of a 4-vector.

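In exactly the same way as for the non-relativistic Schrödinger equation,
we can derive the continuity equation for a probability density $\rho$
and a probability current $\mathbf{j}$:

$$ \frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{j} = 0 \tag{6.21} $$

where

$$ \rho = i\left(\Psi^*\frac{\partial\Psi}{\partial t} - \Psi\frac{\partial\Psi^*}{\partial t}\right) \tag{6.22} $$

$$ \mathbf{j} = -i\left(\Psi^*\nabla\Psi - \Psi\nabla\Psi^*\right) \tag{6.23} $$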


The probability current turns out to be given by the same expression
as in the non-relativistic case, but the probability density is different,
although it shows a nice symmetry with the current, and we can define
a 4-vector current

$$ j^\mu = (\rho, \mathbf{j}) = i\left(\Psi^*\partial^\mu\Psi - \Psi\,\partial^\mu\Psi^*\right) $$

The continuity equation, eqn 6.21, can then be written as $\partial_\mu j^\mu = 0$. The
corresponding conserved quantity is the total probability, which we obtain
by integrating $j^0 = \rho$ over the 3-dimensional space. The underlying
symmetry is invariance with respect to multiplication by a global phase
factor: physics described by $\Psi$ is identical to physics described by $e^{i\theta}\Psi$
for any fixed real parameter $\theta$.$^{32}$

Substituting $\Psi^+(t,\mathbf{x})$ from eqn 6.19 into the expressions for $\rho$ and $\mathbf{j}$ (eqn 6.23), we obtain

$$ \rho^+ = 2|N|^2 E_p, \qquad \mathbf{j}^+ = 2|N|^2\,\mathbf{p} \tag{6.24} $$


Now we can fix the normalization $N$. In NRQM, the volume integral
of the probability density is a constant with value 1 for one particle in
the whole space. This does not work in RQM, because of the Lorentz
contraction, which modifies the volume, contracting one side of a cube,
parallel to the Lorentz boost, by the Lorentz factor $\gamma$. To keep the integral
independent of the Lorentz transformation, the probability density
should grow by the same factor$^{33}$ $\gamma$. So putting $N = 1$ would do the job,
as would any other constant. The choice of $N = 1$ is called the covariant
normalization and corresponds to $2E_p$ particles in a unit volume. Another
popular choice is $N = 1/\sqrt{2m}$, which in the non-relativistic limit
$E_p \to m$ makes $\rho \to \Psi^*\Psi$ and $\mathbf{j} \to$ velocity, approaching the expressions
from NRQM.

For $\Psi^-(t,\mathbf{x})$, eqn 6.24 becomes

$$ \rho^- = -2|N|^2 E_p, \qquad \mathbf{j}^- = -2|N|^2\,\mathbf{p} \tag{6.25} $$
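The signs in eqns 6.24 and 6.25 can be checked numerically. The following is a minimal sketch, assuming one space dimension, natural units, and central finite differences for the derivatives; the function names `plane_wave` and `kg_density_current` are our own illustrative choices and do not come from the text.

```python
import cmath
import math


def plane_wave(sign, p, m, N=1.0):
    """Return Psi(t, x) = N exp(-i*sign*(E_p t - p x)); sign=+1 is Psi^+, sign=-1 is Psi^-."""
    E = math.sqrt(p * p + m * m)
    return lambda t, x: N * cmath.exp(-1j * sign * (E * t - p * x))


def kg_density_current(psi, t, x, h=1e-5):
    """Evaluate the KG density rho and current j at (t, x) using central differences."""
    dpsi_dt = (psi(t + h, x) - psi(t - h, x)) / (2 * h)
    dpsi_dx = (psi(t, x + h) - psi(t, x - h)) / (2 * h)
    val = psi(t, x)
    # rho = i (Psi* dPsi/dt - Psi dPsi*/dt),  j = -i (Psi* dPsi/dx - Psi dPsi*/dx)
    rho = 1j * (val.conjugate() * dpsi_dt - val * dpsi_dt.conjugate())
    j = -1j * (val.conjugate() * dpsi_dx - val * dpsi_dx.conjugate())
    return rho.real, j.real


# Example with p = 0.75, m = 1 (so E_p = 1.25) and covariant normalization N = 1.
for sign in (+1, -1):
    rho, j = kg_density_current(plane_wave(sign, 0.75, 1.0), t=0.3, x=-0.2)
    print(sign, rho, j)
```

With $N = 1$ the positive-energy wave should return $\rho \approx 2E_p$ and $j \approx 2p$, and the conjugate wave the same magnitudes with both signs reversed.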


In summary, $\Psi^+(t,\mathbf{x})$ and related observables, the energy, the probability
density, and the probability current, come out as expected and
behave nicely in the non-relativistic limit. In contrast to $\Psi^+(t,\mathbf{x})$, the
unexpected additional wavefunction $\Psi^-(t,\mathbf{x})$ describes a free particle
with negative energy and negative probability density and with the
probability current flowing in the opposite direction to the particle's
momentum: all properties that are unexpected and difficult to accept.


6.3.1 The Feynman–Stueckelberg interpretation of negative-energy states


In this section, we outline the Feynman–Stueckelberg interpretation of
negative-energy states following the approach described by Feynman in
his Dirac Lecture [74], which is recommended as further reading.

Suppose there is a particle in a state $\phi_0$ as indicated in Fig. 6.2(a).
At time $t_1$, a potential $U_1$ is turned on for a moment, acting on the
particle and changing its state to an intermediate state. At time $t_2$, a
second perturbation $U_2$ changes that intermediate state to the final one,
which could be the same as the original state $\phi_0$. The amplitude for
the particle to go from the initial state $\phi_0$ to the same state $\phi_0$ after
time $t_2$ has a contribution from the amplitude with an intermediate
state, existing for the period of time from $t_1$ to $t_2$, of energy $E_p > 0$.
All possible intermediate states of different energies $E_p > 0$ contribute.
Among them, there are amplitudes for particles travelling faster than
the speed of light. This is the result of insisting that all energies $E_p$
be positive. If one starts a series of waves from a point, keeping all
energies positive, these waves cannot be confined to the inside of the light
cone. The sketch in Fig. 6.2(a) corresponds to an amplitude where the
particle in the intermediate state travels faster than the speed of light.
An observer in the reference frame (a) with coordinates $(t, x)$ observes
one particle in quantum state $\phi_0$ that moves from $x_1$ to $x_2$, ending in
the same quantum state $\phi_0$.

As indicated in Fig. 6.2(b), there is another reference frame (b) with
coordinates $(t', x')$ in which the sequence of events is different; $t_2'$ happens
first, before $t_1'$. An observer in this reference frame has a different
story to tell. A particle at $x_1'$ is in a quantum state $\phi_0$. Nothing happens
until time $t_2'$, when suddenly two particles emerge from the point $x_2'$.
One of these particles travels to $x_1'$ and at time $t_1'$ collides with the
original particle. The particles annihilate with each other, disappearing
from the scene, leaving the third particle at $x_2'$ in state $\phi_0$. In frame
(b), three particles were present between $t_2'$ and $t_1'$. The second observer
can argue that the particle that travelled from $x_2'$ to $x_1'$ is the antiparticle
of the original particle and therefore they were able to annihilate
with each other. So antiparticles must exist and their properties are
defined from particles such that the annihilation works. The first observer
might argue that the antiparticle of the second observer is her/his
particle travelling backwards in time and that is the interpretation of
the negative-energy states. The negative-energy states correspond to
particles travelling backwards in time and therefore the phase of the
wavefunction in eqn 6.20 has a $-iE_p(-t) = +iE_p t$ contribution instead
of $-iE_p t$ as in eqn 6.19. To make the picture complete, we must also
take into account that a particle travelling backwards in time has its
momentum reversed. Mathematically, all this is equivalent to taking
the complex conjugate of the positive-energy solution $\Psi^+$ (eqn 6.19)
to obtain the negative-energy solution $\Psi^-$ (eqn 6.20).

Fig. 6.2 A contribution to the transition amplitude viewed in two different reference frames. Adapted from [74].

34 A proper discussion of interactions will be given in Section 6.5.

35 We have $\Psi(t, s) = \psi(s)\exp(-iE_p t)$, so that $i\,\partial\Psi/\partial t = E_p\Psi$.

In summary, negative-energy solutions of the KG equation represent 
antiparticles. The probability density represents the charge density and 
can be either negative or positive. The same applies for the probability 
current, representing the charged current, the number of charges passing 
through the unit area per unit time. 


Inclusion of interactions via a potential 


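We introduce interactions using a potential and see what happens.
Following the way in which a potential $V$ is introduced into the
non-relativistic Schrödinger equation, we modify the energy operator$^{34}$

$$ i\frac{\partial}{\partial t} \;\to\; i\frac{\partial}{\partial t} - V $$

transforming eqn 6.18 into

$$ \left(i\frac{\partial}{\partial t} - V\right)^2\Psi = \left(-\nabla^2 + m^2\right)\Psi $$

which, for a time-independent, time-like potential $V$ in one dimension
and for energy eigenstates with energy $E_p$, becomes$^{35}$

$$ (E_p - V)^2\,\psi(s) = \left(-\frac{d^2}{ds^2} + m^2\right)\psi(s) \tag{6.26} $$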


Consider a time-like potential barrier of fixed height $V > 0$ for $s > 0$,
as shown in Fig. 6.3. The wavefunction $\psi(s)$ consists of incident, $Ie^{ips}$,
reflected, $Re^{-ips}$, and transmitted, $Te^{iks}$, waves:

$$ \psi_L(s) = Ie^{ips} + Re^{-ips}, \qquad \psi_R(s) = Te^{iks} $$

where

$$ \psi(s) = \begin{cases} \psi_L(s) & \text{for } s < 0 \\ \psi_R(s) & \text{for } s > 0 \end{cases} $$

On substituting $\psi_L$ and $\psi_R$ into eqn 6.26, we obtain $E^2 = p^2 + m^2$ and
$(E - V)^2 = k^2 + m^2$, leading to

$$ p = \pm\sqrt{E^2 - m^2}, \qquad k = \pm\sqrt{(E - V)^2 - m^2} $$

In both cases, we choose a + sign in front of the square root to match
the expected propagation directions as in Fig. 6.3.

Fig. 6.3 A time-like potential barrier of height $V$. Incoming, reflected, and transmitted waves are also indicated.

From the continuity condition at $s = 0$ for $\psi(s)$ and $d\psi(s)/ds$, we get

$$ I + R = T, \qquad pI - pR = kT $$

and so

$$ R = \frac{p - k}{p + k}\,I, \qquad T = \frac{2p}{p + k}\,I \tag{6.27} $$

The probability currents along $s$ for $s < 0$ and $s > 0$ are

$$ j_L = \frac{p}{m}\left(|I|^2 - |R|^2\right), \qquad j_R = \frac{k}{m}\,|T|^2 \tag{6.28} $$


Keeping the energy E fixed, we consider three different cases of 
potential strength. 

The first case is that of a weak potential, $E > V + m$, where $k$ is real
and $k < p$. The probability densities in the two regions are

$$ \rho_L = \frac{E}{m}\,|\psi_L|^2 > 0, \qquad \rho_R = \frac{E - V}{m}\,|\psi_R|^2 > 0 \tag{6.29} $$


This case looks like the non-relativistic one, with nothing special hap- 
pening: a small fraction of the incoming wave is reflected and the rest is 
transmitted. 

In the second case, the potential is of moderate strength, $V - m < E < V + m$,
and $k = i\sqrt{m^2 - (E - V)^2} = i\kappa$ is purely imaginary (with
$\kappa$ real) and

$$ R = \frac{p - i\kappa}{p + i\kappa}\,I $$

so

$$ |R| = |I|, \qquad j_R = 0 $$

The incoming wave is totally reflected and the probability density in the
barrier shows the expected exponential decay

$$ \rho_R = \frac{E - V}{m}\,|\psi_R|^2 = \frac{E - V}{m}\,|T|^2 e^{-2\kappa s} $$

as in the non-relativistic case. However, the situation is not identical,
because the increasing potential $V$ changes the sign of the probability
density from positive ($\rho_R > 0$ in the first case of the weak potential) to
negative:

$$ E > V \;\Rightarrow\; \rho_R > 0 \quad\text{but}\quad E < V \;\Rightarrow\; \rho_R < 0 \tag{6.30} $$

36 This is known as the Klein paradox, another puzzle of RQM, after O. Klein, 1894–1977.


We will come back to this after discussing the case of the strong potential,
$E < V - m$, when $k$ becomes purely real and $k^2 > p^2$. The
probability density, which became negative when the potential became strong
enough in the previous case, stays negative and the probability current
becomes real inside the barrier:

$$ \rho_R = \frac{E - V}{m}\,|T|^2 < 0, \qquad j_R = \frac{k}{m}\,|T|^2 $$

Consider next the previously unphysical case of $k < 0$, because, in a
counter-intuitive way, this case now corresponds to a particle moving to
the right. To see that, we will calculate the group velocity

$$ v_g = \frac{\partial E}{\partial k} = \frac{k}{E - V}, $$

which is positive for $k < 0$ because $E < V$.
A consequence is that $T$ and $R$ given by eqn 6.27 can be arbitrarily large,
possibly making the reflected current bigger than the incoming one.$^{36}$
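The three regimes of potential strength can be explored numerically. The sketch below evaluates eqns 6.27 and 6.28 for unit incident amplitude $I = 1$ in natural units; the function name `klein_step` and the sample values of $E$ and $V$ are our own assumptions, and the branch $k < 0$ in the strong-potential case follows the group-velocity argument above.

```python
import math


def klein_step(E, V, m=1.0):
    """Match waves at a KG potential step with incident amplitude I = 1.

    Returns (regime, k, R, T, j_L, j_R) using eqns 6.27 and 6.28.
    Requires E > m; the strong case diverges at V = 2E, where p + k = 0.
    """
    p = math.sqrt(E * E - m * m)              # incident momentum
    k2 = (E - V) ** 2 - m * m                 # k^2 inside the barrier
    if E > V + m:
        regime, k = "weak", math.sqrt(k2)
    elif E > V - m:
        # k = i*kappa with kappa > 0: exponentially decaying wave in the barrier
        regime, k = "moderate", 1j * math.sqrt(max(-k2, 0.0))
    else:
        # k < 0 is the branch with positive group velocity k / (E - V)
        regime, k = "strong", -math.sqrt(k2)
    R = (p - k) / (p + k)
    T = 2 * p / (p + k)
    j_L = p / m * (1 - abs(R) ** 2)
    # An evanescent wave carries no current; otherwise use j_R = (k/m)|T|^2.
    j_R = 0.0 if regime == "moderate" else k / m * abs(T) ** 2
    return regime, k, R, T, j_L, j_R


for V in (0.5, 2.5, 5.0):
    regime, k, R, T, j_L, j_R = klein_step(E=2.0, V=V)
    print(regime, abs(R), j_L, j_R)
```

In the weak and strong cases the currents on the two sides agree, as the continuity equation requires, while in the strong case $|R| > |I|$ and both currents are negative, which is the behaviour described as the Klein paradox.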
The reason for this is that the very strong potential provides enough 
energy to produce particle-antiparticle pairs. The probability current 
and the probability density within the barrier are negative because cre- 
ated antiparticles (see eqn 6.25) are attracted to the barrier, moving to 
the right (V > 0). The created particles are repelled by the barrier, 
moving to the left, increasing the reflected probability current. Such a 
situation can be created by focusing light from a high-power laser, mak- 
ing a very strong electric field, which in turn produces electron—positron 
pairs from the vacuum. Another example is the Hawking radiation in 
the neighbourhood of a black hole. The fact that the probability density 
already became negative in the case of a moderate-strength potential 
corresponds to vacuum polarization by the creation of virtual particle— 
antiparticle pairs. They do not affect the probability currents, because 
there is not enough energy in the system to promote them to become 
real particles. An analogy is the Lamb shift in atomic physics, where 
the vacuum polarization affects energy levels. The problem of RQM is 
now clear: the formalism describes one particle (or a fixed number of 
particles), but physics needs many particles, the number of which can- 
not be fixed—particles can be created and particles can be annihilated. 
One needs RQF to describe such physics. RQM can only be used as 
long as the number of particles is fixed. Using the uncertainty relation
$\Delta p\,\Delta s \sim \hbar$, we see that pair creation that starts at $\Delta p \sim mc$ sets the
limit on $\Delta s \sim \hbar/mc$. So, as long as we are studying physics at a scale
bigger than $\hbar/mc$, known as the (reduced) Compton wavelength, RQM can be
applied. Atomic physics is an example where this condition is usually
fulfilled: the reduced Compton wavelength of the electron is about $3.86 \times 10^{-13}$ m.
But RQM can also be applied to many processes in high-energy particle
physics. A representative example is electron–positron annihilation producing
hadrons, many of which are pions. The fundamental process in
this case is $e^+ + e^- \to q + \bar{q}$. The number of particles is 2 and is fixed,
and the change from 2 leptons to 2 quarks can be handled by RQM.
Fragmentation of quarks to hadrons takes place on a different, much
slower, time scale and therefore can be separated from the fundamental
process of $e^+ + e^-$ annihilation.

Is there any limit on $\Delta p$? How well can one measure momentum? In
NRQM, momentum can be measured with any precision, but, because
of the upper limit on speed, $< c$, we must have$^{37}$

$$ \Delta p\,\Delta t \sim \frac{\hbar}{c} $$

and consequently infinite precision $\Delta p \to 0$ requires infinite measurement
time $\Delta t \to \infty$.


6.4 The Dirac equation 


This is the most important section of this chapter. The Dirac equation
provides a relativistically consistent equation describing a massive point-like
spin-$\tfrac{1}{2}$ particle such as the electron and it led to the prediction of
the positron, the first antiparticle.

First, we consider different representations of the Dirac equation, the
probability current and bilinear covariants. Then we find the free-particle
states, examine their properties, and introduce chirality and helicity
operators. The formalism is then applied to describe simple Standard
Model processes such as $e^+ + e^- \to \mu^+ + \mu^-$ at energies well above the
muon rest mass but well below the $Z^0$ mass.$^{38}$

The properties of Dirac particles under the discrete symmetries P, C, 
and T are then discussed. Electromagnetic interactions are introduced 
via so-called minimal coupling and the non-relativistic limit is obtained, 
leading to the prediction of g = 2 for the magnetic dipole moment of the 
electron. Finally, there is a brief discussion of the Aharonov-Bohm effect 
and the pre-eminence of the electromagnetic 4-vector potential in RQM. 

Before we move on, we shall give a few words of introduction for
readers who skipped Section 6.1. Unlike the non-relativistic case, where
electron spin is described by a column of two complex numbers, called
the Pauli spinor,$^{39}$ in RQM one needs a pair of two-component spinors
$\xi^\alpha$ and $\eta_{\dot\beta}$ (with $\alpha = 1,2$ and $\dot\beta = 1,2$) called the undotted and dotted
spinors (the latter being distinguished by a dot above the index). These
are just names, and they could be called 'apples' and 'pears' since they
are as different as apples and pears. The reason for the different names
is that the corresponding spinors transform differently under Lorentz
transformations (see eqns 6.7 and 6.8). Combining $\xi^\alpha$ and $\eta_{\dot\beta}$ into one
4-component column gives the Dirac spinor $\Psi$:

$$ \Psi = \begin{pmatrix} \xi \\ \eta \end{pmatrix} $$

Now this Dirac spinor can be transformed using a unitary operator (changing
what is called a representation), mixing $\xi$ and $\eta$. And we must follow
these different $\xi$ and $\eta$ spinors all the way through to know how to
Lorentz-transform the resulting Dirac spinor; having just the four complex
numbers constituting the Dirac spinor is not enough to know how
to apply the Lorentz transformation to them.

37 See the Introduction in [50].

38 The domain of the $e^+e^-$ colliders PETRA (DESY) and PEP (SLAC).

39 Following experimental observations suggesting that the electron has a property called spin, Pauli extended the Schrödinger equation describing the electron's interaction with an electromagnetic field by inserting into it a two-component spinor and corresponding magnetic dipole moment.

40 In some books, $\xi^\alpha$ and $\eta_{\dot\beta}$ swap places, leading to different space-like $\gamma$ matrices, multiplied by $-1$.

42 Equation 6.32 is

$$ i\gamma^0\frac{\partial\Psi}{\partial t} + i\gamma^k\frac{\partial\Psi}{\partial x^k} - m\Psi = 0 $$

Applying $\dagger$, we get

$$ i\frac{\partial\Psi^\dagger}{\partial t}\gamma^0 + i\frac{\partial\Psi^\dagger}{\partial x^k}(-\gamma^k) + m\Psi^\dagger = 0 $$

and multiplying by $\gamma^0$ from the right and using eqn 6.35 gives

$$ i\frac{\partial\bar\Psi}{\partial t}\gamma^0 + i\frac{\partial\bar\Psi}{\partial x^k}\gamma^k + m\bar\Psi = 0 $$

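The Dirac equation, as derived in Section 6.1 as eqn 6.12 or, for readers
who skipped that section, as derived in Exercises 6.2 and 6.3 following
the historical path, can be written as

$$ \begin{pmatrix} 0 & p_0 + \mathbf{p}\cdot\boldsymbol{\sigma} \\ p_0 - \mathbf{p}\cdot\boldsymbol{\sigma} & 0 \end{pmatrix}\Psi = m\Psi \tag{6.31} $$

Instead of $\Psi$, we could use $\Psi' = U\Psi$, where $U$ is a unitary operator. In
the new basis, the new representation, eqn 6.31 would look different. In
general, the Dirac equation can be written as

$$ (\gamma p - m)\Psi = 0 \tag{6.32} $$

where

$$ \gamma p = \gamma^\mu p_\mu = p_0\gamma^0 - \mathbf{p}\cdot\boldsymbol{\gamma} = i\gamma^0\frac{\partial}{\partial t} + i\boldsymbol{\gamma}\cdot\nabla \tag{6.33} $$

Comparing eqns 6.32 and 6.31, we can see that in the representation
that was used to get eqn 6.31,

$$ \gamma^0 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \qquad \boldsymbol{\gamma} = \begin{pmatrix} 0 & -\boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} \tag{6.34} $$

This is known as the Weyl or symmetric or chiral representation.

Multiplying eqn 6.32 by $\gamma p$ from the left, we get representation-independent
constraints on the $\gamma$ matrices:

$$ \gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu} \tag{6.35} $$

The matrix $\gamma^0$ is Hermitian and the matrices $\gamma^k$ are anti-Hermitian (in
any representation):$^{41}$

$$ \gamma^{0\dagger} = \gamma^0, \qquad \gamma^{k\dagger} = -\gamma^k \tag{6.36} $$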
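The constraints of eqns 6.35 and 6.36 are easy to verify for an explicit set of matrices. The following is a minimal sketch using NumPy, assuming the Weyl representation with $\gamma^0$ off-diagonal and $\gamma^k$ built from $-\sigma_k$ and $+\sigma_k$ blocks; the helper names are our own.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Weyl (chiral) representation: gamma^0 = [[0, 1], [1, 0]], gamma^k = [[0, -sigma_k], [sigma_k, 0]].
GAMMA_WEYL = [np.block([[Z2, I2], [I2, Z2]])] + [np.block([[Z2, -s], [s, Z2]]) for s in SIGMA]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])  # g^{mu nu}, signature (+, -, -, -)


def satisfies_clifford_algebra(gammas):
    """Check gamma^mu gamma^nu + gamma^nu gamma^mu = 2 g^{mu nu} for all index pairs."""
    identity = np.eye(4)
    return all(
        np.allclose(gammas[mu] @ gammas[nu] + gammas[nu] @ gammas[mu],
                    2 * METRIC[mu, nu] * identity)
        for mu in range(4)
        for nu in range(4)
    )


def satisfies_hermiticity(gammas):
    """Check that gamma^0 is Hermitian and each gamma^k is anti-Hermitian."""
    g0_ok = np.allclose(gammas[0].conj().T, gammas[0])
    gk_ok = all(np.allclose(g.conj().T, -g) for g in gammas[1:])
    return g0_ok and gk_ok


print(satisfies_clifford_algebra(GAMMA_WEYL), satisfies_hermiticity(GAMMA_WEYL))
```

Because both conditions are preserved by a unitary change of basis, the same two functions can be applied to the matrices of any other representation obtained in that way.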

Applying Hermitian conjugation to eqn 6.32, using properties of the $\gamma$
matrices given by eqn 6.36 and after some algebra,$^{42}$ we get the adjoint
Dirac equation

$$ \bar\Psi(\gamma p + m) = 0 \tag{6.37} $$

where the adjoint spinor is

$$ \bar\Psi = \Psi^\dagger\gamma^0 \tag{6.38} $$

and $p$ acts on the left.

Multiplying eqn 6.32 by $\bar\Psi$ from the left and eqn 6.37 by $\Psi$ from the
right and adding the resulting equations, we get

$$ \bar\Psi\gamma^\mu(\partial_\mu\Psi) + (\partial_\mu\bar\Psi)\gamma^\mu\Psi = \partial_\mu(\bar\Psi\gamma^\mu\Psi) = 0 $$

which is the continuity equation, $\partial_\mu j^\mu = 0$, for the probability current
4-vector

$$ j^\mu = \bar\Psi\gamma^\mu\Psi \tag{6.39} $$

The probability density

$$ \rho = j^0 = \Psi^\dagger\Psi = \sum_{i=1}^{4}|\Psi_i|^2 \tag{6.40} $$

is the time-like component of the probability current; it is positive-definite
and has a similar form to the non-relativistic expression. $\Psi$ and
$\bar\Psi$ may be used to form quantities with well-defined space-time
transformation properties, known as bilinear covariants. The simplest is the
Lorentz scalar $\bar\Psi\Psi$.$^{43}$ The next simplest is $\bar\Psi\gamma^\mu\Psi$, which transforms as
a Lorentz 4-vector.

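The Hamiltonian $H$ of the Dirac equation is obtained by multiplying
eqn 6.32 by $\gamma^0$ from the left and separating the time derivative:

$$ H\Psi = i\frac{\partial\Psi}{\partial t} \tag{6.41} $$

where

$$ H = \boldsymbol{\alpha}\cdot\mathbf{p} + \beta m \tag{6.42} $$

and

$$ \boldsymbol{\alpha} = \gamma^0\boldsymbol{\gamma}, \qquad \beta = \gamma^0 \tag{6.43} $$

The matrices $\boldsymbol{\alpha}$ and $\beta$ are Hermitian and in the Weyl representation are
given by$^{44}$

$$ \boldsymbol{\alpha} = \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & -\boldsymbol{\sigma} \end{pmatrix}, \qquad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{6.44} $$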


The Weyl representation is very well suited to the ultrarelativistic
limit, where the mass can be neglected, because the Dirac bispinor is then
effectively reduced to a single Weyl spinor. In the non-relativistic limit,
however, both Weyl spinor components of the Dirac bispinor contribute
equally, so another representation, called the standard or Dirac representation,
is more suitable. This representation is often used in introductory
textbooks. The transformation from the Weyl representation to the
Dirac representation is effected by the unitary transformation

$$ U = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} $$

which gives

$$ \Psi(\text{Dirac}) = \begin{pmatrix}\phi \\ \chi\end{pmatrix} = U\,\Psi(\text{Weyl}) = U\begin{pmatrix}\xi \\ \eta\end{pmatrix} = \frac{1}{\sqrt{2}}\begin{pmatrix}\xi + \eta \\ \xi - \eta\end{pmatrix} \tag{6.45} $$

The transformation of the $\gamma$ matrices, $\gamma(\text{Dirac}) = U\gamma(\text{Weyl})U^{-1}$, gives

$$ \gamma^0 = \beta = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \boldsymbol{\gamma} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ -\boldsymbol{\sigma} & 0 \end{pmatrix}, \qquad \boldsymbol{\alpha} = \begin{pmatrix} 0 & \boldsymbol{\sigma} \\ \boldsymbol{\sigma} & 0 \end{pmatrix} $$

The Dirac equation in the (standard) Dirac representation is then
given by

$$ \begin{aligned} E\phi - \mathbf{p}\cdot\boldsymbol{\sigma}\,\chi &= m\phi \\ -E\chi + \mathbf{p}\cdot\boldsymbol{\sigma}\,\phi &= m\chi \end{aligned} \tag{6.46} $$

In the non-relativistic limit, $\chi \to 0$ and the Dirac spinor becomes
effectively a two-component Pauli spinor.

43 The matrix $\gamma^0$ that is inside $\bar\Psi$ swaps spinors in the bispinor such that the dotted index meets the dotted one ($\xi^*\eta$) and the undotted index meets the undotted one ($\eta^*\xi$). It should be noted that complex conjugation adds or removes a dot, so if $\xi$ has an undotted index, $\xi^*$ has a dotted index.

44 We have $\alpha_i\alpha_j + \alpha_j\alpha_i = 2\delta_{ij}$, $\beta\boldsymbol{\alpha} + \boldsymbol{\alpha}\beta = 0$, and $\beta^2 = 1$.
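The change of representation can be checked by direct matrix multiplication. The sketch below assumes the Weyl matrices with $\gamma^k$ built from $-\sigma_k$ and $+\sigma_k$ blocks and the transformation matrix $U$ with rows $(1, 1)$ and $(1, -1)$ divided by $\sqrt{2}$; the helper name `to_dirac` is ours.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Weyl-representation matrices (assumed convention, see the lead-in).
GAMMA0_WEYL = np.block([[Z2, I2], [I2, Z2]])
GAMMA_WEYL = [np.block([[Z2, -s], [s, Z2]]) for s in SIGMA]

# Unitary matrix taking (xi, eta) to (phi, chi) = ((xi + eta), (xi - eta)) / sqrt(2).
U = np.block([[I2, I2], [I2, -I2]]) / np.sqrt(2)


def to_dirac(matrix):
    """Transform a 4x4 matrix from the Weyl to the Dirac representation."""
    return U @ matrix @ U.conj().T


gamma0_dirac = to_dirac(GAMMA0_WEYL)
gamma_dirac = [to_dirac(g) for g in GAMMA_WEYL]
alpha_dirac = [gamma0_dirac @ g for g in gamma_dirac]

print(np.real_if_close(gamma0_dirac))
```

The printed matrix should be diagonal, and the transformed $\gamma^k$ and $\alpha_k$ take the off-diagonal block forms of the standard representation.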

The fact that in the relativistic theory one needs two Weyl spinors to
describe the electron while in the non-relativistic world one Pauli spinor
is enough is always difficult to accept. To give some insight into why
this is the case, consider yet another representation, that of Foldy and
Wouthuysen (FW). We start with the Dirac representation and apply a
momentum-dependent unitary transformation $U_{FW}$ given by

$$ U_{FW} = \exp\left(\frac{1}{2}\,\frac{\beta\,\boldsymbol{\alpha}\cdot\mathbf{p}}{|\mathbf{p}|}\,\arctan\frac{|\mathbf{p}|}{m}\right) $$

The wavefunction in the FW representation is then

$$ \Psi(\text{FW}) = \begin{pmatrix} u \\ w \end{pmatrix} = U_{FW}\,\Psi(\text{Dirac}) = U_{FW}\begin{pmatrix}\phi \\ \chi\end{pmatrix} $$

and after the transformation of the Hamiltonian, eqn 6.42 splits into
components, becoming

$$ i\frac{\partial u}{\partial t} = +\sqrt{\mathbf{p}^2 + m^2}\;u \tag{6.47} $$

$$ i\frac{\partial w}{\partial t} = -\sqrt{\mathbf{p}^2 + m^2}\;w \tag{6.48} $$

We now have two decoupled equations for positive- and negative-energy
solutions, respectively. But if we want to drop one of them and consider,
say, only the equation for the positive energies, then there is a problem,
because we would not know how to transform the positive-energy spinor
without any knowledge of the other one.$^{45}$ In the non-relativistic limit,
however, $\sqrt{\mathbf{p}^2 + m^2} \approx m + |\mathbf{p}|^2/2m$, which leads to the Schrödinger
Hamiltonian, and the Lorentz transformation becomes a Galilean one,
which does not affect spin. So, in the non-relativistic limit, we can just
take one equation, for example the one for the positive energy, and
use it to describe a non-relativistic electron, with its spinor $u$ being
effectively the Pauli spinor. We can 'forget' about the negative-energy
solution.
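That the FW transformation decouples the positive- and negative-energy components can be confirmed numerically. The sketch below is a minimal check in the Dirac representation; it uses the closed form $\exp(\theta A) = \cos\theta + A\sin\theta$, valid because $A = \beta\,\boldsymbol{\alpha}\cdot\mathbf{p}/|\mathbf{p}|$ satisfies $A^2 = -1$. The function name `fw_transform` and the sample momentum are our own assumptions.

```python
import numpy as np

Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
BETA = np.diag([1, 1, -1, -1]).astype(complex)          # Dirac representation
ALPHA = [np.block([[Z2, s], [s, Z2]]) for s in SIGMA]   # alpha_k = [[0, sigma_k], [sigma_k, 0]]


def fw_transform(p, m):
    """Return (U_FW, H, U_FW H U_FW^dagger) for a non-zero momentum 3-vector p."""
    p = np.asarray(p, dtype=float)
    p_mag = np.linalg.norm(p)
    alpha_dot_p = sum(pi * a for pi, a in zip(p, ALPHA))
    H = alpha_dot_p + m * BETA                   # Dirac Hamiltonian, eqn 6.42
    theta = 0.5 * np.arctan2(p_mag, m)           # half of arctan(|p| / m)
    A = BETA @ alpha_dot_p / p_mag               # anti-Hermitian, A @ A = -1
    U = np.cos(theta) * np.eye(4) + np.sin(theta) * A   # closed form of exp(theta * A)
    return U, H, U @ H @ U.conj().T


U_fw, H, H_fw = fw_transform([0.3, -0.4, 1.2], m=1.0)
print(np.round(H_fw.real, 12))
```

The transformed Hamiltonian should come out as $\beta\sqrt{\mathbf{p}^2 + m^2}$, i.e. diagonal with two positive and two negative entries, which is the content of the two decoupled equations.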


Majorana particles 


This is a short, rather technical, detour from the main track to introduce
the concept of the Majorana particle.$^{46}$ It is not essential for
what follows. No fundamental Majorana particle has yet been discovered,
although it could be that neutrinos are Majorana particles, and a composite
Majorana particle has been discovered in condensed matter. From
the discussion of spinors in Section 6.1.1, one might get the impression
that a massless spin-$\tfrac{1}{2}$ particle is described by the Weyl spinor and a massive
one by the Dirac spinor, which has two independent Weyl spinors
as components. Although all the particles we know either fit, or could
fit, this scenario, this is not the only scenario. We can start with one
Weyl spinor, say, an undotted spinor $\xi^\alpha$, get a dotted one by complex
conjugation, and then lower the index with the metric tensor $\epsilon$ to give
a spinor that transforms like $\eta_{\dot\beta}$. We can then use these $\xi$- and $\eta$-like
objects to construct a Dirac spinor that satisfies the Dirac equation in
the Weyl representation for a massive particle, the Majorana particle,
effectively defined by one Weyl spinor rather than by two independent
Weyl spinors as for the electron.


6.4.1 Free-particle solutions 


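The Dirac equation, eqn 6.46, in the Dirac (standard) representation can
be written as

$$ \begin{pmatrix} E & -\mathbf{p}\cdot\boldsymbol{\sigma} \\ \mathbf{p}\cdot\boldsymbol{\sigma} & -E \end{pmatrix}\begin{pmatrix}\phi \\ \chi\end{pmatrix} = m\begin{pmatrix}\phi \\ \chi\end{pmatrix} $$

45 We only know how to transform $\xi^\alpha$ and $\eta_{\dot\beta}$ and they are buried inside the $u$ and $w$ spinors. We would need to get $\xi^\alpha$ and $\eta_{\dot\beta}$ out of $u$ and $w$ (transforming $u$ and $w$ back to $\xi^\alpha$ and $\eta_{\dot\beta}$), do the Lorentz transformation, and transform back to get the Lorentz-transformed $u$ and $w$.

46 Named after Ettore Majorana, 1906–1938.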


160 Relativistic quantum mechanics 


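For a particle at rest ($\mathbf{p} = 0$), there are four independent, un-normalized, solutions for the energy
eigenstates:

$$ \begin{pmatrix}1\\0\\0\\0\end{pmatrix}e^{-im\tau}, \quad \begin{pmatrix}0\\1\\0\\0\end{pmatrix}e^{-im\tau} \quad \text{for } E = m $$

$$ \begin{pmatrix}0\\0\\1\\0\end{pmatrix}e^{+im\tau}, \quad \begin{pmatrix}0\\0\\0\\1\end{pmatrix}e^{+im\tau} \quad \text{for } E = -m $$

By boosting these solutions to the frame where $\mathbf{p} \neq 0$, we get ($s = 1, 2$)$^{47}$

$$ u^{(s)}\exp[-i(E_p t - \mathbf{p}\cdot\mathbf{x})] = u^{(s)}\exp(-ip\cdot x) \quad \text{for } E > 0 $$

and, for the $-E_p$ energy eigenstate and $\mathbf{p}$ momentum eigenstate,

$$ u^{(s+2)}\exp(+iE_p t)\exp(i\mathbf{p}\cdot\mathbf{x}) = u^{(s+2)}\exp[+i(E_p t + \mathbf{p}\cdot\mathbf{x})] \quad \text{for } E < 0 $$

where

$$ u^{(s)} = N\begin{pmatrix}\phi^{(s)} \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\,\phi^{(s)}\end{pmatrix}, \qquad u^{(s+2)} = N\begin{pmatrix}\dfrac{-\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\,\chi^{(s)} \\ \chi^{(s)}\end{pmatrix} $$

$N$ is a normalization constant, $E_p = +\sqrt{\mathbf{p}^2 + m^2}$, and

$$ \phi^{(1)} = \chi^{(1)} = \begin{pmatrix}1\\0\end{pmatrix}, \qquad \phi^{(2)} = \chi^{(2)} = \begin{pmatrix}0\\1\end{pmatrix} $$

As in the case of the KG equation, instead of the above two solutions
for $E < 0$ energy eigenstates,$^{48}$ we will follow the Feynman–Stueckelberg
interpretation of negative-energy states as antiparticles that are equivalent
to particles travelling backwards in time. This requires us to replace
the momentum $\mathbf{p}$ by $-\mathbf{p}$ and to decide which spin state to choose in going
from a particle to an antiparticle.

Taken together, these steps give the four independent solutions of the
Dirac equation:

$$ \Psi^+(x) = u^{(s)}(p)\,e^{-ip\cdot x}, \qquad \Psi^-(x) = v^{(s)}(p)\,e^{+ip\cdot x} \tag{6.49} $$

where $u^{(s)}$ is as above and

$$ v^{(1)}(p) = u^{(4)}(-p), \qquad v^{(2)}(p) = u^{(3)}(-p) \tag{6.50} $$

47 We can go back to the Weyl representation, boost $\xi$ (eqn 6.15) and $\eta$ (eqn 6.16), and return to the Dirac representation.

48 In principle, we could carry on with them, as is done in a number of textbooks.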


Both $\Psi^+$, describing a free electron, and $\Psi^-$, describing a free positron,
are twofold-degenerate. We need a quantum number (label) to
distinguish the pairs of states with the same energy. A suitable operator,
commuting with the free-particle Hamiltonian, is the helicity operator

$$ h(\mathbf{p}) = \begin{pmatrix} \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|} & 0 \\ 0 & \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|} \end{pmatrix} \tag{6.51} $$

In a convenient reference frame where $\mathbf{p} = (0, 0, p)$,

$$ h(\mathbf{p}) = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} $$

and, for example,

$$ u^{(1)} = N\begin{pmatrix} 1 \\ 0 \\ \dfrac{p}{E_p + m} \\ 0 \end{pmatrix} $$

Thus $u^{(s)}$ (as well as $v^{(s)}$) are helicity eigenstates:

$$ h(\mathbf{p})\,u^{(1)} = +u^{(1)}, \qquad h(\mathbf{p})\,u^{(2)} = -u^{(2)} $$

Helicity eigenvalues correspond to the spin component along the direction
of motion. Helicity + means that, in its rest frame, the electron
has $+\tfrac{1}{2}$ spin projection on the axis parallel to $\mathbf{p}$ (likewise for helicity −
and the $-\tfrac{1}{2}$ spin projection); $\phi^{(s)}$ and $\chi^{(s)}$ are also defined with respect
to this axis, which has to be the $z$ axis, given the chosen representation
of the Pauli matrices.

We can see now that changing $\mathbf{p}$ to $-\mathbf{p}$ changes the direction of the
quantization axis in the rest frame of the electron and the spin projection
changes sign as in eqn 6.50 (see the $s$ label).$^{49}$

As for the KG equation, we use covariant normalization, requiring
that the integral of the probability density over the unit volume gives
$2E_p$ particles. This gives the normalization constant as $N = \sqrt{E_p + m}$.

49 See [13] for further reading.
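The properties of the positive-energy spinors quoted above can be verified directly. The following sketch, in the Dirac representation with $N = \sqrt{E_p + m}$, checks that the spinors satisfy the Dirac equation, have $u^\dagger u = 2E_p$, and are helicity eigenstates when $\mathbf{p}$ points along $z$; the function names are our own illustrative choices.

```python
import numpy as np

Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]


def sigma_dot(p):
    """Return the 2x2 matrix sigma . p for a 3-vector p."""
    return sum(pi * s for pi, s in zip(p, SIGMA))


def dirac_spinors(p, m):
    """Return E_p and the two positive-energy spinors u1, u2 with N = sqrt(E_p + m)."""
    p = np.asarray(p, dtype=float)
    E = np.sqrt(p @ p + m * m)
    N = np.sqrt(E + m)
    basis = [np.array([1, 0], dtype=complex), np.array([0, 1], dtype=complex)]
    return E, [N * np.concatenate([phi, sigma_dot(p) @ phi / (E + m)]) for phi in basis]


def dirac_operator(E, p, m):
    """Return gamma^0 E - gamma . p - m in the Dirac representation."""
    gamma0 = np.diag([1, 1, -1, -1]).astype(complex)
    sp = sigma_dot(p)
    gamma_dot_p = np.block([[Z2, sp], [-sp, Z2]])
    return E * gamma0 - gamma_dot_p - m * np.eye(4)


def helicity_operator(p):
    """Return the 4x4 block-diagonal helicity operator for momentum p."""
    sp_hat = sigma_dot(p) / np.linalg.norm(p)
    return np.block([[sp_hat, Z2], [Z2, sp_hat]])


p_z = [0.0, 0.0, 2.0]
E, (u1, u2) = dirac_spinors(p_z, m=1.0)
print(np.vdot(u1, u1).real, 2 * E)
```

The two printed numbers should agree, which is the statement that covariant normalization corresponds to $2E_p$ particles per unit volume.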


6.4.2 Chirality ≠ helicity

Chirality, also called handedness, is defined by a pair of projection
operators$^{50}$

$$ P_L = \frac{1 - \gamma^5}{2}, \qquad P_R = \frac{1 + \gamma^5}{2}, \qquad \text{where } \gamma^5 = i\gamma^0\gamma^1\gamma^2\gamma^3 \tag{6.52} $$

They divide the space of wavefunctions into right-handed and left-handed
half-spaces. $P_R$ projects a wavefunction onto the right-handed
half-space, giving the right-handed component of the wavefunction. $P_L$
does the same with respect to the left-handed half-space and the left-handed
component of the wavefunction.$^{51}$ Describing the Dirac spinor
in the Dirac representation (ignoring the normalization) in terms of the
Weyl spinors $\xi$ and $\eta$ of the Weyl representation (see eqn 6.45), we get

$$ P_R\Psi = \frac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix}\begin{pmatrix}\xi + \eta \\ \xi - \eta\end{pmatrix} = \begin{pmatrix}\xi \\ \xi\end{pmatrix} $$

$$ P_L\Psi = \frac{1}{2}\begin{pmatrix} 1 & -1 \\ -1 & 1 \end{pmatrix}\begin{pmatrix}\xi + \eta \\ \xi - \eta\end{pmatrix} = \begin{pmatrix}\eta \\ -\eta\end{pmatrix} $$

We can thus define left-handed and right-handed spinors as

$$ \xi_R = \xi, \qquad \eta_L = \eta \tag{6.53} $$

The projection operators $P_L$ and $P_R$ are important in particle physics
because weak interactions are sensitive to these components.$^{52}$ The $W$
boson couples only to the left-handed part of a particle wavefunction
and the $Z$ boson couples to both parts but with different strengths.

50 These satisfy $P_L + P_R = 1$, $P_L P_R = P_R P_L = 0$, $P_L P_L = P_L$, and $P_R P_R = P_R$.

51 Thus $\Psi = P_R\Psi + P_L\Psi = \Psi_R + \Psi_L$.

52 The operators $P_L$ and $P_R$ respectively pull the dotted and undotted components out of the Dirac spinor.

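In order to gain better insight, we will consider chiral components of
the Dirac spinor $u$:

$$ u = N\begin{pmatrix}\phi \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\,\phi\end{pmatrix} $$

where $\phi = \phi^+ + \phi^-$ is a superposition of helicity + ($\phi^+$) and helicity −
($\phi^-$) eigenstates (not normalized). Because

$$ \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\,\phi^+ = \phi^+, \qquad \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\,\phi^- = -\phi^- $$

it follows that

$$ \frac{1}{2}\left(1 + \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\phi = \phi^+, \qquad \frac{1}{2}\left(1 - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\phi = \phi^- $$

Finally,

$$ \left(1 - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p + m}\right)\phi = \frac{a}{2}\left(1 + \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\phi + \frac{b}{2}\left(1 - \frac{\boldsymbol{\sigma}\cdot\mathbf{p}}{|\mathbf{p}|}\right)\phi $$

where

$$ a = 1 - \frac{|\mathbf{p}|}{E_p + m}, \qquad b = 1 + \frac{|\mathbf{p}|}{E_p + m} $$

gives a decomposition of the final state into helicity + and helicity −
components with weights $a/2$ and $b/2$, respectively. In summary,

$$ \tfrac{1}{2}(1 - \gamma^5)\,u = \tfrac{a}{2}\,(\text{helicity } +) + \tfrac{b}{2}\,(\text{helicity } -) = u_L \quad (\text{left chiral state}) $$

$$ \tfrac{1}{2}(1 + \gamma^5)\,u = \tfrac{b}{2}\,(\text{helicity } +) + \tfrac{a}{2}\,(\text{helicity } -) = u_R \quad (\text{right chiral state}) $$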

It should be clear now that chirality and helicity are two different
things. However, in the limit

$$ \text{speed} \to c \;\Rightarrow\; a \to 0 \text{ and } b \to 2 $$

chirality and helicity become identical, which is the subject of the next
section. Before that, we will explore consequences, relevant at energies
where masses cannot be neglected, of chirality and helicity being
different, for example affecting weak decay rates of particles. As an
example, we will consider $\pi^- \to \mu^- + \bar\nu_\mu$ decay.

In the rest frame of the $\pi^-$, the momenta of $\mu^-$ and $\bar\nu_\mu$ are back
to back and the helicities are as indicated in Fig. 6.4. The $\bar\nu_\mu$ is in the
helicity + state,$^{53}$ with its spin along its momentum. The $\mu^-$ has to be in
the helicity + state to conserve total angular momentum, since the pion
has spin zero. But the $W^-$ couples to the left-handed component of the
$\mu^-$ wavefunction and therefore the $\mu^-$ needs, simultaneously, to have
helicity + (to conserve angular momentum) and left-handed chirality
(for the $W^-$ to couple). The probability for this is

$$ \frac{|a|^2}{|a|^2 + |b|^2} = \frac{1}{2}\left(1 - \frac{\text{speed}}{c}\right) $$

We can see that as the speed tends to $c$, the decay rate tends to 0, so the
$\pi^-$ cannot decay to a massless $\mu^-$. This explains why, although favoured
by the energy phase-space factor (not considered here), the decay rate
for $\pi^- \to e^- + \bar\nu_e$ is much smaller than that for $\mu^- + \bar\nu_\mu$.$^{54}$
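To get a feeling for the size of this effect, the probability above can be evaluated for the muon and the electron from pion decay at rest. The sketch below assumes a massless neutrino, two-body kinematics, and rounded particle masses in MeV; the function names are our own. It computes only the helicity factor, not the full decay rate, which also contains the phase-space factor.

```python
import math

# Rounded masses in MeV (assumed input values for this illustration).
M_PION, M_MUON, M_ELECTRON = 139.570, 105.658, 0.511


def helicity_weights(p, m):
    """Return (a, b) = (1 - |p|/(E_p + m), 1 + |p|/(E_p + m))."""
    E = math.sqrt(p * p + m * m)
    kappa = p / (E + m)
    return 1.0 - kappa, 1.0 + kappa


def suppressed_helicity_probability(p, m):
    """Return |a|^2 / (|a|^2 + |b|^2), the helicity + fraction of a left chiral state."""
    a, b = helicity_weights(p, m)
    return a * a / (a * a + b * b)


def lepton_momentum(m_parent, m_lepton):
    """Lepton momentum in the parent rest frame for a decay to lepton + massless neutrino."""
    return (m_parent ** 2 - m_lepton ** 2) / (2.0 * m_parent)


for name, m_lepton in (("muon", M_MUON), ("electron", M_ELECTRON)):
    p = lepton_momentum(M_PION, m_lepton)
    beta = p / math.sqrt(p * p + m_lepton * m_lepton)
    print(name, suppressed_helicity_probability(p, m_lepton), 0.5 * (1.0 - beta))
```

For each lepton the two printed numbers should coincide, confirming the identity between the weights $a$, $b$ and the speed, and the electron value is several orders of magnitude smaller than the muon value because the electron is almost ultrarelativistic.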


6.4.3 Helicity conservation and interactions via 
currents 


It is worth repeating the conclusion of the previous section that chirality 
(handedness) is different from helicity and that this has consequences at 
low energies where masses cannot be neglected. At high energies, where 
masses can be neglected, helicity and chirality can be treated as identical 
and in many textbooks this is the working assumption right from the 
start. 

It can be shown (see e.g. [84]) that in the probability current

$$ j^\mu = \bar{u}\gamma^\mu u = (\bar{u}_L + \bar{u}_R)\gamma^\mu(u_L + u_R) = \bar{u}_L\gamma^\mu u_L + \bar{u}_R\gamma^\mu u_R $$

no cross terms like $\bar{u}_L\gamma^\mu u_R$ are present. In the high-energy limit where
masses can be neglected,

$$ \tfrac{1}{2}(1 - \gamma^5)\,u = u_L \approx u_{-} \quad \text{is the helicity − eigenstate} $$

$$ \tfrac{1}{2}(1 + \gamma^5)\,u = u_R \approx u_{+} \quad \text{is the helicity + eigenstate} $$

and therefore the probability current does not contain any helicity-mixing
terms like $\bar{u}_{-}\gamma^\mu u_{+}$. We will now discuss the significance of this.

Fig. 6.4 $\pi^- \to \mu^- + \bar\nu_\mu$ decay.

53 We are not going to consider the helicity − state, since such a state has never been observed. A neutrino with non-zero mass is either described by a Dirac spinor with two Weyl spinors contributing but with one sterile helicity state or as a massive Majorana particle having only one helicity state.

54 See Exercise 6.4.

55 For an introduction to propagators, see Chapter 1 and, for example, [84] for greater detail.

56 In this context, 'lowest order' refers to a perturbative expansion in powers of the coupling constant, here $e^2/4\pi$.

Fig. 6.5 (a) Feynman diagram for $e^- + \mu^- \to e^- + \mu^-$. This is in fact the sum of two diagrams. The vertical wiggly line representing the exchanged particle propagator is the sum of two scenarios depending on which particle was the emitter and which was the absorber of the exchanged particle (e.g. a photon or $Z^0$) as sketched in (b). Time goes from left to right.

In classical electromagnetism, two parallel wires carrying electric cur- 
rents interact with each other with a force proportional to the product 
of the currents. We can try a similar idea to describe scattering of par- 
ticles. Multiplying the probability current of a free electron by its electric 
charge will give us a 4-vector with time-like component representing the 
charge density and space-like component representing the number of 
electric charges crossing unit area per unit time, i.e. the electric cur- 
rent corresponding to moving electrons. If we now take that current 
and another one, for, say, a free muon, then we can expect that the dot 
product of these currents will have something to do with the electromagnetic
interaction of these particles. Multiplying that by a propagator⁵⁵
(which contains the $g_{\mu\nu}$ tensor for the above dot product) allows for
momentum transfer between the interacting particles and indeed gives 
the matrix element of the lowest-order approximation to the scatter- 
ing amplitude. That matrix element can be visualized by a Feynman 
diagram (see Fig. 6.5). 

It is a property of the Standard Model (SM) that interactions between 
any two SM fermions can be described, in a leading approximation, by 
the current-current interaction as outlined above, although in the case
of the weak interactions the left- and right-handed parts have to be
treated separately because the weak interaction bosons couple to them
differently (see Chapter 7). Since in the probability current there are
no helicity-mixing terms like $\bar{u}_\downarrow\gamma^\mu u_\uparrow$, there are only two fundamental
SM vertices, as illustrated in Fig. 6.6, where f stands for a fermion
and the wiggly line of the exchanged particle represents one of the SM
vector bosons: the photon, W⁺, W⁻, Z⁰, or any of the eight gluons.


[Fig. 6.5: Feynman diagrams, panels (a) and (b); see the caption of Fig. 6.5.]


The helicity is the same before and after the scattering (coupling to the 
exchanged particle). One says that the helicity is conserved, remaining 
unchanged by the interaction. 

The annihilation or pair creation vertices are obtained from the scat- 
tering ones by crossing symmetry (see e.g. [13] or [84]). Keeping in mind 
that in Fig. 6.6, time is going from left to right, we ‘cross’ the incoming 
particle to the other side of the reaction equation by inverting its
4-momentum and swapping the helicity state so that it travels backwards
in time, representing the outgoing antiparticle with the opposite helicity
travelling forwards in time⁵⁷ (Fig. 6.6):

T+ (a) = ul (pje? + ul (—pjet?* = uM (per 
Crossing symmetry allows us to obtain the annihilation amplitude,
for example for $e^+ + e^- \to \mu^+ + \mu^-$, from the scattering one, for
$e^- + \mu^- \to e^- + \mu^-$, by ‘crossing’ the relevant particles to the other
side of the reaction equation by changing in the scattering amplitude the
corresponding 4-momenta p → −p, the helicities, and the spinors u → v,
and (something that is beyond the formalism we are using) putting in
‘by hand’ the minus sign in front of the whole amplitude (QFT is needed


[Fig. 6.6: vertex diagrams; each fermion line is labelled by its energy, momentum, and helicity, e.g. f(E_p1, p1, helicity +); see the caption of Fig. 6.6.]
57 The arrow on the fermion line indicates
in which direction with respect
to the time arrow the fermion specified
by the label at the end of the line is
travelling.


Fig. 6.6 Fundamental vertices of the 
Standard Model in the high-energy 
limit where masses can be neglected. 




58 There is also a u-channel process; see
e.g. [84].


59 For proofs of all the results mentioned
in this section, see e.g. [52].


60 It can also be shown that if the
wavefunctions transform this way, then the
Dirac equation is covariant with respect
to this coordinate transformation.


for that). This transformation also affects the description of the
propagator. In the scattering amplitude, we have, in the example considered,
4-momenta $p_1$ and $p_2$ for the incoming and outgoing electrons, respectively,
and therefore the 4-momentum of the exchanged boson is the
difference $p_1 - p_2$, which, dotted with itself, gives $t = (p_1 - p_2)\cdot(p_1 - p_2)$;
for that reason, the scattering is called the t-channel process. With the
change $p_2 \to -p_2$, $t \to s = (p_1 + p_2)\cdot(p_1 + p_2)$, which is the square of
the centre-of-mass energy; the resulting annihilation reaction is therefore
called the s-channel process.⁵⁸
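
The algebra of the crossing step can be checked numerically. The following is only a minimal sketch: the massless kinematics, the beam energy, the scattering angle, and the helper names `massless` and `mandelstam_t` are illustrative assumptions, not taken from the text. It evaluates t for a pair of 4-momenta and shows that replacing p2 by −p2 in the same expression gives s.

```python
import math

def minkowski_dot(a, b):
    """Dot product of two 4-vectors (E, px, py, pz) with metric (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def massless(energy, theta):
    """Hypothetical helper: massless 4-momentum in the x-z plane at polar angle theta."""
    return (energy, energy * math.sin(theta), 0.0, energy * math.cos(theta))

def mandelstam_t(p1, p2):
    """t = (p1 - p2).(p1 - p2), the squared 4-momentum of the exchanged boson."""
    q = tuple(x - y for x, y in zip(p1, p2))
    return minkowski_dot(q, q)

# Illustrative values only (arbitrary units).
p1 = massless(10.0, 0.0)
p2 = massless(10.0, math.radians(60.0))

t = mandelstam_t(p1, p2)

# Crossing: substitute p2 -> -p2 in the expression for t.
p2_crossed = tuple(-x for x in p2)
s = mandelstam_t(p1, p2_crossed)

# Direct evaluation of (p1 + p2).(p1 + p2) for comparison.
total = tuple(x + y for x, y in zip(p1, p2))
s_direct = minkowski_dot(total, total)

print("t =", t, " s from crossing =", s, " s direct =", s_direct)
```

For these massless momenta t is negative (space-like exchange) while s is positive, which is the usual distinction between the two channels.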


6.4.4 P, T, and a comment on C


By construction, the Dirac equation is covariant with respect to Lorentz 
transformations and space inversion. It is also covariant with respect 
to time inversion. Space inversion P: r → −r and time inversion T:
t → −t are discussed in this section. We make a comment about charge
conjugation C: particle → antiparticle at the end of this section.

Consider two observers with reference frames O and O’. They are 
describing the same system or physical process, using their coordinates 
and wavefunctions Ψ(x) and Ψ′(x′), respectively. The coordinates are
related by a linear coordinate transformation $(x^\nu)' = a^\nu{}_\mu x^\mu$, where $a^\nu{}_\mu$
could be the Lorentz transformation or a space or time inversion.
The gamma matrices in their Dirac equations are also changed by
that transformation, but, after a lot of algebra, it can be shown that
both observers can neglect differences between their gamma matrices.⁵⁹
The covariance of the Dirac equation requires that the wavefunctions
transform as⁶⁰

$$\Psi'(x') = S(a)\,\Psi(x)$$

where S(a) is a matrix whose inverse $S^{-1}(a)$ exists, with $S^{-1}(a) = S(a^{-1})$. One says
that the coordinate transformation a induces a transformation S(a) in
the space of wavefunctions: Ψ′(x′) = S(a)Ψ(x).


Space inversion P 


The space inversion transformation is

$$a = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}$$


We want to find S(a) satisfying Ψ′(x′) = S(a)Ψ(x). Considering, for
example, $\Psi_+(x) = u\,e^{-ip\cdot x}$ and following the expectation from classical
physics that r → −r makes r′ = −r, p′ = −p and σ′ = σ (the angular
momentum axial vector), we obtain

$$(\Psi_+)'(x') = \begin{pmatrix} \phi^{(1)} \\ \dfrac{\boldsymbol{\sigma}'\cdot\mathbf{p}'}{E_p+m}\,\phi^{(1)} \end{pmatrix} \exp[-i(E_p t - \mathbf{p}'\cdot\mathbf{x}')]$$

$$= \begin{pmatrix} \phi^{(1)} \\ -\dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p+m}\,\phi^{(1)} \end{pmatrix} \exp[-i(E_p t - \mathbf{p}\cdot\mathbf{x})]$$

$$= \gamma^0\,\Psi_+(x)$$

giving $S = \gamma^0$, up to a fixed phase, which is set to 1 by convention.
We could get that result immediately by looking at the transformation
properties of Weyl spinors (see eqn 6.9) and the transformation from the
Weyl to the Dirac representation (eqn 6.45).

Eigenstates of the parity operator $S = \gamma^0$ are states of defined intrinsic
parity. Positive-energy states in the particle rest frame,

$$u^{(1)}e^{-imt}, \qquad u^{(2)}e^{-imt}$$

are eigenstates of $\gamma^0$ with eigenvalue +1, i.e. they have intrinsic parity
+1, but negative-energy states

$$v^{(1)}e^{+imt}, \qquad v^{(2)}e^{+imt}$$

are eigenstates with eigenvalue −1, thus having intrinsic parity −1. We
can state, then, that the intrinsic parity of a spin-1/2 particle (defined in
its rest frame) is the negative of the intrinsic parity of its antiparticle.

This applies to higher-spin fermions as well. Free-particle states with
p ≠ 0 are not eigenstates of the parity operator $\gamma^0$.
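
These parity statements are easy to verify numerically. The sketch below assumes the Dirac representation, in which γ⁰ = diag(1, 1, −1, −1), and uses unnormalized spinors of the form (φ, σ·p/(E+m) φ); the helper names `u_spinor` and `v_rest` and the numerical values are illustrative choices, not the book's notation.

```python
import numpy as np

# Pauli matrices and gamma^0 in the (assumed) Dirac representation.
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA0 = np.diag([1, 1, -1, -1]).astype(complex)

def u_spinor(p, m, phi):
    """Unnormalized positive-energy spinor (phi, sigma.p/(E+m) phi)."""
    p = np.asarray(p, dtype=float)
    energy = np.sqrt(p @ p + m * m)
    sigma_dot_p = sum(pi * si for pi, si in zip(p, SIGMA))
    return np.concatenate([phi, sigma_dot_p @ phi / (energy + m)])

def v_rest(chi):
    """Rest-frame negative-energy spinor: only the lower components are non-zero."""
    return np.concatenate([np.zeros(2, dtype=complex), chi])

phi1 = np.array([1, 0], dtype=complex)
m = 1.0                          # illustrative mass
p = np.array([0.3, -0.4, 1.2])   # illustrative momentum

u_rest = u_spinor(np.zeros(3), m, phi1)
u_p = u_spinor(p, m, phi1)

# Rest-frame states are parity eigenstates with eigenvalues +1 and -1.
rest_u_parity = np.allclose(GAMMA0 @ u_rest, u_rest)
rest_v_parity = np.allclose(GAMMA0 @ v_rest(phi1), -v_rest(phi1))

# For p != 0, gamma^0 maps u(p) onto u(-p), so u(p) is not an eigenstate.
moving_flips_p = np.allclose(GAMMA0 @ u_p, u_spinor(-p, m, phi1))
moving_is_eigenstate = np.allclose(GAMMA0 @ u_p, u_p) or np.allclose(GAMMA0 @ u_p, -u_p)

print(rest_u_parity, rest_v_parity, moving_flips_p, moving_is_eigenstate)
```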


Time inversion T 


For time inversion,

$$a = \begin{pmatrix} -1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}$$


The derivation of S (a) here is more complicated than for space inversion. 
We consider only one aspect and then give the result. From classical 
physics, we expect that t → −t makes r′ = r, p′ = −p, and t′ = −t, and
therefore

$$\exp[-i(E_p t' - \mathbf{p}'\cdot\mathbf{x}')] = \exp[-i(-E_p t + \mathbf{p}\cdot\mathbf{x})] = \{\exp[-i(E_p t - \mathbf{p}\cdot\mathbf{x})]\}^*$$

so complex conjugation is involved. The full derivation gives
$\Psi'(t') = S(a)\Psi(t) = i\gamma^1\gamma^3\,\Psi^*(t)$, again up to an arbitrary fixed phase, which, by
convention, is taken to be 1.
61 The SM does predict a very tiny
violation of time-inversion symmetry,
leading to the prediction of a very tiny
d, much smaller than this limit. A number
of extensions of the SM have been
rejected because they predicted d to be
larger than the current limit.


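It is useful to consider how S(T) = S(a) acts on a free-particle state,
for example

$$S(T)\begin{pmatrix} \phi^{(1)} \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E_p+m}\,\phi^{(1)} \end{pmatrix} \exp[-i(E_p t - \mathbf{p}\cdot\mathbf{x})] = -i\begin{pmatrix} \phi^{(2)} \\ \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}'}{E_p+m}\,\phi^{(2)} \end{pmatrix} \exp[-i(E_p t' - \mathbf{p}'\cdot\mathbf{x}')]$$

Note that the spin flips, $\phi^{(1)} \to \phi^{(2)}$, but the helicity does not flip because
the direction of the momentum changes too, so a positive-helicity state
remains a positive-helicity state in the time-reversed system.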
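
The action of the time-inversion operator on the spinor part can be checked with a few lines of linear algebra. This is a sketch under stated assumptions: the Dirac representation for the gamma matrices, unnormalized spinors (φ, σ·p/(E+m) φ), and the convention S(T)Ψ = iγ¹γ³Ψ*; the helper names and numerical values are illustrative only.

```python
import numpy as np

S1 = np.array([[0, 1], [1, 0]], dtype=complex)
S2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
S3 = np.array([[1, 0], [0, -1]], dtype=complex)
SIGMA = [S1, S2, S3]
ZERO = np.zeros((2, 2), dtype=complex)

def gamma_spatial(sig):
    """gamma^i = [[0, sigma_i], [-sigma_i, 0]] in the (assumed) Dirac representation."""
    return np.block([[ZERO, sig], [-sig, ZERO]])

# Matrix part of the time-inversion operator: i * gamma^1 * gamma^3.
T_MATRIX = 1j * gamma_spatial(S1) @ gamma_spatial(S3)

def time_invert(psi):
    """Anti-unitary action on the spinor: complex conjugate, then multiply by i gamma^1 gamma^3."""
    return T_MATRIX @ np.conj(psi)

def u_spinor(p, m, phi):
    """Unnormalized positive-energy spinor (phi, sigma.p/(E+m) phi)."""
    p = np.asarray(p, dtype=float)
    energy = np.sqrt(p @ p + m * m)
    sigma_dot_p = sum(pi * si for pi, si in zip(p, SIGMA))
    return np.concatenate([phi, sigma_dot_p @ phi / (energy + m)])

phi1 = np.array([1, 0], dtype=complex)
phi2 = np.array([0, 1], dtype=complex)
m = 1.0                          # illustrative mass
p = np.array([0.3, -0.4, 1.2])   # illustrative momentum

psi = u_spinor(p, m, phi1)

# Spin flips (phi1 -> phi2) and momentum reverses (p -> -p), with an overall factor -i.
action_matches = np.allclose(time_invert(psi), -1j * u_spinor(-p, m, phi2))

# Applying the anti-unitary operation twice returns minus the original spinor.
squares_to_minus_one = np.allclose(time_invert(time_invert(psi)), -psi)

print(action_matches, squares_to_minus_one)
```

The second check depends on the complex conjugation: the matrix iγ¹γ³ alone squares to +1, and it is the anti-unitary combination that squares to −1.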

S(T) is anti-unitary, $S^2(T) = -1$, and this has interesting consequences.
If we have interactions that are invariant with respect to time
inversion, then S(T) commutes with the Hamiltonian. If Ψ is an
eigenstate of the Hamiltonian, then S(T)Ψ is also an eigenstate of the
Hamiltonian, with the same energy. But if S(T)Ψ = ζΨ, where ζ is a
phase, then

$$S(T)S(T)\Psi = S(T)\zeta\Psi = \zeta^*\zeta\Psi = \Psi$$

in contradiction to $S^2(T) = -1$. For that reason, S(T)Ψ is a different
state to Ψ; there is at least a twofold degeneracy, known as Kramers’
degeneracy. A spin-1/2 particle has a natural twofold degeneracy due to
the two spin projections (2j + 1 states). A static magnetic field can lift
that degeneracy by coupling to the particle’s magnetic dipole moment,
but a static magnetic field is not invariant with respect to time inversion,
so nothing is wrong with having non-degenerate states in the magnetic
field. The situation is different if the particle is put into a static electric
field, which is invariant with respect to time inversion. If the particle has
an electric dipole moment, the electric field will couple to it and would lift
the 2j + 1 degeneracy, shifting the energy levels up or down depending on
the electric dipole orientation. Kramers’ degeneracy forbids this because
shifted energy states are required to be at least twofold-degenerate. So,
unless there is an extra degree of freedom to guarantee this, electric
dipole moments are forbidden by Kramers’ degeneracy. A molecule of
water has a large electric dipole moment, but in molecular or atomic
systems, there are extra energy-degenerate (or nearly degenerate) states
that allow this (see e.g. [126]).

In a series of beautiful experiments, pioneered by N. F. Ramsey [120,
121], using a beam of ultra-cold neutrons coming from a nuclear reactor
at the Institut Laue-Langevin (ILL) in Grenoble, the absolute value of
the neutron electric dipole moment d was measured to be smaller than
2.9 × 10⁻²⁶ e cm at 90% confidence level.⁶¹


A comment on C 


For every particle, there is a corresponding antiparticle (and vice 
versa). This symmetry of nature is called charge-conjugation symmetry. 


It has nothing to do with Lorentz invariance and is the same in all 
inertial frames. Charge conjugation leads to a unitary operator C in 
RQF (all states have positive energies in RQF). In RQM, one can con- 
struct an anti-unitary operator transforming a positive-energy state 
into a negative-energy state (spinor u → spinor v). Because it is
anti-unitary, i.e. different from the common unitary charge-conjugation 
operator of RQF, we will not spend time deriving it, to avoid possible 
confusion. 

We will, however, briefly outline basic properties of charge conjuga- 
tion (see e.g. [118]). C changes the sign of the electric charge, magnetic 
moment, baryon number, and lepton number. Dynamical quantities like 
the energy, momentum, and helicity are left unchanged. Except for the 
weak interaction, all the interactions obey charge-conjugation symmetry. 
Eigenstates of charge conjugation are neutral particles like the photon,
π⁰, η, and ρ⁰, with eigenvalue +1 or −1. It is −1 for the photon and
therefore +1 for a two-photon state, and so, as the π⁰ has eigenvalue
+1, the decay π⁰ → γγ is allowed. However, charge-conjugation symmetry
forbids π⁰ from decaying into three photons (eigenvalue −1). For
particles built out of two fermions, $f\bar{f}$, the eigenvalue is $(-1)^{l+S}$, where
S is the total spin of the $f\bar{f}$ state and l is the relative orbital angular
momentum of the fermions. The e⁺e⁻ bound state, positronium, decays
to two photons if it is in the singlet state (S = 0, l = 0) and to three
photons if it is in the triplet state (S = 1, l = 0).
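
The counting rule in this paragraph can be written as a short sketch. The function names are hypothetical; the only added input is that decay to a single photon is excluded by energy-momentum conservation, so the search for the smallest allowed photon number starts at two.

```python
def c_parity_ffbar(l, s):
    """C eigenvalue (-1)^(l+S) of a fermion-antifermion state."""
    return (-1) ** (l + s)

def c_parity_photons(n):
    """C eigenvalue of an n-photon state: each photon contributes -1."""
    return (-1) ** n

def lowest_photon_count(l, s):
    """Smallest number of photons (at least two) whose C eigenvalue matches the state."""
    target = c_parity_ffbar(l, s)
    n = 2  # one-photon decay is excluded by energy-momentum conservation
    while c_parity_photons(n) != target:
        n += 1
    return n

singlet_photons = lowest_photon_count(l=0, s=0)  # positronium singlet
triplet_photons = lowest_photon_count(l=0, s=1)  # positronium triplet
print(singlet_photons, triplet_photons)
```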

In conclusion, we note that the weak interaction violates the symmetry
of each of the three discrete transformations P, T, and C, as well
as any combination of any two of them. For all other interactions, each
of the three discrete transformations is a symmetry of the interaction.
But all known interactions, including the weak interaction, obey CPT
symmetry.⁶² In RQF, it is impossible to construct Lorentz-invariant
interactions that would violate CPT symmetry.


6.4.5 Electromagnetic interactions and the non-relativistic limit


So far, we have considered only the free-particle Dirac equation. The 
interaction of an electron (or indeed any electrically charged spin-1/2
structureless fermion) with a classical electromagnetic field is introduced 
through so-called ‘minimal coupling’, following the prescription by which 
the electromagnetic interactions are introduced in classical mechanics. 
The physics of this prescription will be discussed in Section 6.5. 

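The classical electromagnetic field is given by a 4-vector potential
$A^\mu = (\phi, \mathbf{A})$ and its interaction is introduced into the Dirac equation by
modifications of the derivatives:

$$i\frac{\partial}{\partial t} \to i\frac{\partial}{\partial t} - q\phi, \qquad -i\nabla \to -i\nabla - q\mathbf{A}$$

where q < 0 is the electron charge. The Dirac equation then becomes

$$\left(i\frac{\partial}{\partial t} - q\phi\right)\Psi(\mathbf{r},t) = \left[\boldsymbol{\alpha}\cdot(-i\nabla - q\mathbf{A}) + \beta m\right]\Psi(\mathbf{r},t) \qquad (6.54)$$


62 It should be noted that in RQF, CPT
is an anti-unitary operator.


63 Namely,

$$(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b})$$

$$(\nabla\times\mathbf{A} + \mathbf{A}\times\nabla)f = f(\nabla\times\mathbf{A}) = f\mathbf{B}$$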


Solutions of this equation exist for some cases of the electromagnetic 
field (see e.g. [96]). We will find the non-relativistic limit of eqn 6.54 by 
applying an iterative method of approximation. 

At low energies, the mass m of a particle is the main part of the energy, 
so we will factor that part out of the wavefunction: 


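$$\Psi(\mathbf{r},t) = e^{-imt}\begin{pmatrix} \psi_U(\mathbf{r},t) \\ \psi_L(\mathbf{r},t) \end{pmatrix}$$

$\psi_U(\mathbf{r},t)$ and $\psi_L(\mathbf{r},t)$ are known as the upper and lower components of
the Dirac spinor. They contain all ‘non-rest’ energy information relevant
for the non-relativistic limit. Inserting this Ψ into eqn 6.54, we find

$$i\frac{\partial\psi_U}{\partial t} = \boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})\,\psi_L + q\phi\,\psi_U \qquad (6.55)$$

$$i\frac{\partial\psi_L}{\partial t} = \boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})\,\psi_U + q\phi\,\psi_L - 2m\,\psi_L \qquad (6.56)$$

Equation 6.56 may be rearranged to give

$$\psi_L = \frac{\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})}{2m}\,\psi_U - \frac{i\,\partial/\partial t - q\phi}{2m}\,\psi_L \qquad (6.57)$$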


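Because m is relatively large, $\psi_L$ is small compared with $\psi_U$. In the
first-order approximation, we neglect the last term in eqn 6.57 and take

$$\psi_L = \psi_{L1} = \frac{\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})}{2m}\,\psi_U$$

Inserting this into eqn 6.55, we obtain the first-order approximation $\psi_{U1}$
for $\psi_U$:

$$i\frac{\partial\psi_{U1}}{\partial t} = \left\{\frac{[\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})][\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})]}{2m} + q\phi\right\}\psi_{U1}$$

Using vector identities,⁶³ this equation can be written as

$$i\frac{\partial\psi_{U1}}{\partial t} = \left[\frac{(\mathbf{p} - q\mathbf{A})^2}{2m} - \frac{q}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B} + q\phi\right]\psi_{U1}$$

and finally as

$$i\frac{\partial\psi_{U1}}{\partial t} = H_P\,\psi_{U1}$$

where

$$H_P = q\phi + \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} - \frac{q}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B}$$

is the Pauli Hamiltonian for the Schrödinger equation and B is the
magnetic field. Since the Hamiltonian of the interaction of a magnetic
dipole moment μ with an external magnetic field B is −μ · B, we can
identify (q/2m)σ with the electron magnetic dipole moment μ. On the
other hand, the electron magnetic moment is related to the electron spin
by $\boldsymbol{\mu} = -g\,\tfrac{1}{2}\boldsymbol{\sigma}\,\mu_B$, where g is the proportionality factor and $\mu_B = |q|/2m$
is the Bohr magneton (q is the electron charge). From this, we find that
g = 2, which was a triumph for Dirac and his equation.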
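
The Pauli-matrix identity behind this step, (σ·a)(σ·b) = a·b + iσ·(a×b), and the resulting value of g can be checked numerically. The sketch below uses randomly chosen ordinary vectors, so it only tests the algebraic identity and not the operator ordering that produces the magnetic field; the units (|q| = m = 1) and variable names are illustrative assumptions.

```python
import numpy as np

# Pauli matrices stacked into an array of shape (3, 2, 2).
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def sigma_dot(v):
    """Return the 2x2 matrix sigma.v for a real 3-vector v."""
    return np.einsum('i,ijk->jk', np.asarray(v, dtype=complex), SIGMA)

rng = np.random.default_rng(0)
a = rng.normal(size=3)
b = rng.normal(size=3)

lhs = sigma_dot(a) @ sigma_dot(b)
rhs = np.dot(a, b) * np.eye(2) + 1j * sigma_dot(np.cross(a, b))
identity_holds = np.allclose(lhs, rhs)

# g-factor: compare mu = (q/2m) sigma with mu = -g (1/2) sigma mu_B, mu_B = |q|/2m.
q, m = -1.0, 1.0               # illustrative units with |q| = m = 1, q < 0
mu_B = abs(q) / (2 * m)
g = -2 * (q / (2 * m)) / mu_B

print(identity_holds, g)
```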

The value of g was known from atomic physics measurements, but 
until the advent of the Dirac equation, there was no explanation why 
the experimental value was as it was. In fact, the value of g for the 
electron, as well as for the muon, is not exactly 2. The small difference 
is explained by RQF. The value of (g − 2)/2 has been measured with
fantastic precision: 2.8 × 10⁻¹³ for the electron [85] and 6 × 10⁻¹⁰ for
the muon [49]. As for the precision measurements of the neutron electric
dipole moment, in these cases also the precision of the measurements 
and theoretical calculations give constraints on extensions of the SM. 

It should be noted that the magnetic dipole moment of the proton 
is quite different from that predicted by the Dirac equation. The value 
of g is about 5.6 instead of about 2 (using the proper magneton for the 
proton). For the neutron, it is even more surprising—instead of 0 because 
the electric charge is 0, g is about —3.8. These significant deviations 
from the predictions of the Dirac equation, which are applicable for 
point-like particles, were the first indications that protons and neutrons 
are not point-like particles. As we now know, they have a complicated
substructure of quarks and gluons.

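We can carry on with the iterative procedure and get relativistic
corrections beyond the Pauli Hamiltonian. The next one is obtained by
substituting $\psi_{L1}$ for $\psi_L$ in the last term of eqn 6.57, thus giving the
second-order approximation:

$$\psi_L = \psi_{L2} = \frac{\boldsymbol{\sigma}\cdot(\mathbf{p} - q\mathbf{A})}{2m}\,\psi_U - \frac{i\,\partial/\partial t - q\phi}{2m}\,\psi_{L1}$$

Unfortunately, inserting this into eqn 6.55 gives a Hamiltonian that is
not Hermitian and the electron acquires an imaginary electric dipole
moment. Formal problems of this type took many years to solve after Dirac
published his equation. Eventually, a consistent second-order Hermitian
Hamiltonian was found:

$$\begin{aligned}
H = {} & \frac{(\mathbf{p} - q\mathbf{A})^2}{2m} - \frac{|\mathbf{p}|^4}{8m^3} \\
& + q\phi - \frac{q}{2m}\,\boldsymbol{\sigma}\cdot\mathbf{B} \\
& - \frac{q}{8m^2}\,\nabla\cdot\mathbf{E} \\
& - \frac{iq}{8m^2}\,\boldsymbol{\sigma}\cdot(\nabla\times\mathbf{E}) - \frac{q}{4m^2}\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p})
\end{aligned}$$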
Fig. 6.7 Cartesian and spherical
coordinate systems.


The first line in this expression for the Hamiltonian represents the kinetic
term with its first relativistic correction, the third the Darwin term, and
the fourth the spin-orbit interaction (where E is the electric field). For
E = −∇φ and a spherically symmetric potential φ, σ · (∇ × E) = 0 and
the spin-orbit term takes the familiar form

$$H_{SO} = -\frac{q}{4m^2}\,\boldsymbol{\sigma}\cdot(\mathbf{E}\times\mathbf{p}) = \frac{q}{4m^2}\,\frac{1}{r}\frac{d\phi}{dr}\,\boldsymbol{\sigma}\cdot\mathbf{l}$$

where l is the orbital angular momentum of the electron. It should be
noted that Thomas precession is automatically included, as it should be,
in the relativistic formalism.

Historically, the first formally successful derivation of the non- 
relativistic limit of the Dirac equation that included interactions with 
the (classical) electromagnetic field was obtained via the FW transform- 
ation. It is interesting that one has to introduce the interaction first 
into the Dirac equation in the Weyl or Dirac representation (or any 
other representation related to them by a transformation not depend- 
ing on the momentum) and only then make the FW transformation. 
One might think that doing the FW transformation for the free-particle 
Dirac equation first, thus decoupling the lower and upper components 
of the Dirac spinor, and then introducing the interactions via ‘minimal 
coupling’ would work—but it does not. 


6.5 Gauge symmetry 


By considering transformations of space-time, one gets a relativistic de- 
scription of free particles. In order to describe their interactions, one 
needs to consider transformations, gauge transformations, in another 
space, an internal space. Symmetries of the gauge transformations in 
that internal space are at the heart of the SM. Before discussing gauge 
symmetries, we need to revise the three essential ingredients of the 
formalism. 


6.5.1 Covariant derivative 


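Consider two coordinate systems as sketched in Fig. 6.7. The spherical
basis vectors are related to the Cartesian ones in the following way:

$$e_r = \frac{\partial x}{\partial r}\,e_x + \frac{\partial y}{\partial r}\,e_y = \cos\varphi\;e_x + \sin\varphi\;e_y, \qquad |e_r| = 1$$

$$e_\varphi = \frac{\partial x}{\partial\varphi}\,e_x + \frac{\partial y}{\partial\varphi}\,e_y = -r\sin\varphi\;e_x + r\cos\varphi\;e_y, \qquad |e_\varphi| = r$$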
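Next introduce a constant vector field a of unit length, as sketched in
Fig. 6.8, using both coordinate systems. In the Cartesian basis,

$$a = 1\,e_x + 0\,e_y = a^x e_x + a^y e_y$$

and in the spherical basis,

$$a = \cos\varphi\;e_r - \frac{1}{r}\sin\varphi\;e_\varphi = a^r e_r + a^\varphi e_\varphi$$

Although

$$\frac{\partial a^x}{\partial x} = \frac{\partial a^x}{\partial y} = \frac{\partial a^y}{\partial x} = \frac{\partial a^y}{\partial y} = 0$$

in the spherical basis,

$$\frac{\partial a^r}{\partial\varphi} = -\sin\varphi \neq 0, \qquad \frac{\partial a^\varphi}{\partial\varphi} = -\frac{1}{r}\cos\varphi \neq 0$$

which looks wrong because the field a is constant: nothing is changing
from point to point. It is wrong because the differentiation has not
taken into account that the spherical basis vectors change from one space
point to another. If we take this into account, differentiating not only
coordinates but also basis vectors, everything is fine. For example,

$$\begin{aligned}
\frac{\partial a}{\partial\varphi} &= \frac{\partial}{\partial\varphi}\left(\cos\varphi\;e_r - \frac{1}{r}\sin\varphi\;e_\varphi\right) \\
&= -\sin\varphi\;e_r + \cos\varphi\,\frac{\partial e_r}{\partial\varphi} - \frac{1}{r}\cos\varphi\;e_\varphi - \frac{1}{r}\sin\varphi\,\frac{\partial e_\varphi}{\partial\varphi} \\
&= -\sin\varphi\;e_r + \cos\varphi\left(\frac{1}{r}\,e_\varphi\right) - \frac{1}{r}\cos\varphi\;e_\varphi - \frac{1}{r}\sin\varphi\,(-r\,e_r) \\
&= 0
\end{aligned}$$

Fig. 6.8 A constant vector field a.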
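So, in general, for a vector V,

$$\frac{\partial V}{\partial x^\beta} = \frac{\partial V^\alpha}{\partial x^\beta}\,e_\alpha + V^\alpha\,\frac{\partial e_\alpha}{\partial x^\beta}$$

The last derivative, $\partial e_\alpha/\partial x^\beta$, is a vector and can be described as a linear
combination of the basis vectors $e_\mu$:

$$\frac{\partial e_\alpha}{\partial x^\beta} = \Gamma^\mu_{\alpha\beta}\,e_\mu$$

The geometrical object $\Gamma^\mu_{\alpha\beta}$ is called a connection. Changing the names
of the indices in the above equation, μ → α and α → μ, we can write

$$\frac{\partial V}{\partial x^\beta} = \left(\frac{\partial V^\alpha}{\partial x^\beta} + V^\mu\,\Gamma^\alpha_{\mu\beta}\right)e_\alpha$$

This is the covariant derivative, which takes care of the changing
coordinates as well as the basis vectors.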
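
As a numerical illustration of the covariant derivative, the sketch below takes the constant field a = e_x in plane polar coordinates, with components (cos φ, −sin φ / r), and checks that every component of its covariant derivative vanishes even though some plain partial derivatives do not. The connection coefficients are read off from the derivatives of the basis vectors; the function names, the finite-difference step, and the sample point are illustrative assumptions.

```python
import math

def components(r, phi):
    """Polar components (a^r, a^phi) of the constant field a = e_x."""
    return (math.cos(phi), -math.sin(phi) / r)

def christoffel(r):
    """Non-zero connection coefficients Gamma^alpha_{mu beta}, keyed (alpha, mu, beta).

    Index 0 is r and index 1 is phi. They follow from
    d e_r / d phi = (1/r) e_phi, d e_phi / d r = (1/r) e_phi, d e_phi / d phi = -r e_r.
    """
    return {(0, 1, 1): -r, (1, 0, 1): 1.0 / r, (1, 1, 0): 1.0 / r}

def partial(alpha, beta, r, phi, h=1e-6):
    """Central-difference estimate of the plain derivative d a^alpha / d x^beta."""
    if beta == 0:
        plus, minus = components(r + h, phi), components(r - h, phi)
    else:
        plus, minus = components(r, phi + h), components(r, phi - h)
    return (plus[alpha] - minus[alpha]) / (2 * h)

def covariant(alpha, beta, r, phi):
    """Covariant derivative component: d a^alpha / d x^beta + a^mu Gamma^alpha_{mu beta}."""
    gamma = christoffel(r)
    v = components(r, phi)
    correction = sum(v[mu] * gamma.get((alpha, mu, beta), 0.0) for mu in range(2))
    return partial(alpha, beta, r, phi) + correction

r0, phi0 = 2.0, 0.7                       # illustrative sample point
naive = partial(0, 1, r0, phi0)           # d a^r / d phi, equal to -sin(phi0)
full = [covariant(al, be, r0, phi0) for al in range(2) for be in range(2)]

print("plain derivative:", naive)
print("covariant components:", full)
```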
64 In manifestly Lorentz-covariant form
(and with c = 1), $A^\mu \to A'^\mu = A^\mu - \partial^\mu\lambda$.


Fig. 6.9 Diagram of a two-slit
electron diffraction experiment
demonstrating the Aharonov-Bohm
effect.


6.5.2 Gauge invariance in electromagnetism 


In classical electromagnetism, because ∇ · B = 0, we can introduce a
vector potential A, such that the magnetic field B = ∇ × A. Then

$$\nabla\times\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{B}}{\partial t} = 0 \quad\text{can be written as}\quad \nabla\times\left(\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}\right) = 0$$

allowing the introduction of a scalar potential φ, such that

$$\mathbf{E} + \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t} = -\nabla\phi$$

which, after rearrangement, gives the electric field as

$$\mathbf{E} = -\nabla\phi - \frac{1}{c}\frac{\partial\mathbf{A}}{\partial t}$$

If we now take an arbitrary but differentiable scalar function λ(r, t) and
make the transformation

$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\lambda \qquad (6.58)$$

then B remains unchanged. If, simultaneously with this transformation,
we make the change

$$\phi \to \phi' = \phi - \frac{1}{c}\frac{\partial\lambda}{\partial t} \qquad (6.59)$$

then E will also stay unchanged. The transformations in eqns 6.58 and
6.59 are called gauge transformations⁶⁴ and the fact that electric and
magnetic fields stay the same is called gauge invariance.
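
Gauge invariance of the fields can be illustrated numerically. The following sketch is not a general proof: the potentials, the gauge function, the sample point, and the use of units with c = 1 are all arbitrary illustrative assumptions. It computes E = −∇φ − ∂A/∂t and B = ∇ × A by central differences before and after the combined transformation of A and φ, and compares them.

```python
H = 1e-4  # finite-difference step (illustrative)

def phi(x, y, z, t):
    """Illustrative scalar potential."""
    return x * x + t * y

def vec_a(x, y, z, t):
    """Illustrative vector potential (Ax, Ay, Az)."""
    return (-y * t, x * z, z * t * t)

def lam(x, y, z, t):
    """Illustrative gauge function lambda(r, t)."""
    return x * y * t + z * z

def deriv(f, point, axis):
    """Central-difference derivative of scalar f along axis (0, 1, 2 = x, y, z; 3 = t)."""
    up, down = list(point), list(point)
    up[axis] += H
    down[axis] -= H
    return (f(*up) - f(*down)) / (2 * H)

def component(a_f, i):
    """Scalar function returning the i-th component of a vector-valued function."""
    return lambda x, y, z, t: a_f(x, y, z, t)[i]

def fields(phi_f, a_f, point):
    """E = -grad(phi) - dA/dt and B = curl(A), with c = 1 assumed."""
    ax, ay, az = (component(a_f, i) for i in range(3))
    e = tuple(-deriv(phi_f, point, i) - deriv(component(a_f, i), point, 3) for i in range(3))
    b = (deriv(az, point, 1) - deriv(ay, point, 2),
         deriv(ax, point, 2) - deriv(az, point, 0),
         deriv(ay, point, 0) - deriv(ax, point, 1))
    return e, b

def phi_prime(x, y, z, t):
    """Transformed scalar potential: phi - d(lambda)/dt."""
    return phi(x, y, z, t) - deriv(lam, (x, y, z, t), 3)

def vec_a_prime(x, y, z, t):
    """Transformed vector potential: A + grad(lambda)."""
    grad = tuple(deriv(lam, (x, y, z, t), i) for i in range(3))
    return tuple(a + g for a, g in zip(vec_a(x, y, z, t), grad))

point = (0.3, -1.1, 0.8, 0.5)
e_before, b_before = fields(phi, vec_a, point)
e_after, b_after = fields(phi_prime, vec_a_prime, point)

max_change = max(abs(u - v) for u, v in zip(e_before + b_before, e_after + b_after))
potential_change = abs(phi_prime(*point) - phi(*point))
print("largest change in E or B:", max_change)
print("change in scalar potential:", potential_change)
```

The potentials themselves change by a finite amount at the sample point, while the fields agree to within the finite-difference rounding error.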


6.5.3 The Aharonov-Bohm effect 


Following Feynman et al. [75], we will examine two-slit electron diffrac- 
tion. To study the Aharonov-Bohm effect, we use the set-up sketched in 
Fig. 6.9. With no current in the solenoid, electrons are diffracted by the 
slits and form an interference pattern on the screen. As soon as current is 
flowing through the solenoid, the diffraction pattern changes to another 
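one. The probability amplitude $\psi_1$ for an electron to follow path 1 and
the amplitude $\psi_2$ for the path 2 are modified in the following way:

$$\psi_1 = \psi_{01}\exp\left(\frac{iqS_1}{\hbar}\right), \qquad \psi_2 = \psi_{02}\exp\left(\frac{iqS_2}{\hbar}\right)$$

where $\psi_{01}$ and $\psi_{02}$ are the amplitudes without the current, q is the
electron charge, and $S_1$ and $S_2$ are extra phases due to the presence of
the current in the solenoid, which are given by

$$S_1 = \int_{\text{path 1}} \mathbf{A}\cdot d\mathbf{r}, \qquad S_2 = \int_{\text{path 2}} \mathbf{A}\cdot d\mathbf{r}$$

The modification of the phase on the screen is then

$$\frac{q}{\hbar}(S_1 - S_2) = \frac{q}{\hbar}\left(\int_{\text{path 1}} \mathbf{A}\cdot d\mathbf{r} - \int_{\text{path 2}} \mathbf{A}\cdot d\mathbf{r}\right) = \frac{q}{\hbar}\oint_{\text{closed path}} \mathbf{A}\cdot d\mathbf{r}$$

which, by Stokes’ theorem, is proportional to the magnetic flux in the
solenoid.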

For an infinitely long thin (and therefore not obstructing the slits) 
solenoid, there is no magnetic field B outside the solenoid. The ex- 
perimentally observed modification of the interference pattern [60] is 
therefore due to the vector potential A, present outside the solenoid, 
with a magnitude inversely proportional to the distance from the solen- 
oid axis. Classically, the magnetic field B and the vector potential A 
are equivalent, in the sense that one can use either. This is not true at 
the quantum level, where A is apparently more fundamental. It should 
be noted that the same applies to the scalar potential ¢ and the electric 
field E, where instead of space dimensions one considers time. 
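
The statement that the closed-path integral of A measures the enclosed flux, even though B vanishes along the path, can be illustrated with a short numerical sketch. It assumes an idealized infinitely thin solenoid on the z axis, for which A outside has magnitude Φ/(2πr) and points in the azimuthal direction; the flux value, the rectangular loops, and the function names are illustrative choices.

```python
import math

FLUX = 1.0  # magnetic flux inside the solenoid (arbitrary units, illustrative)

def vector_potential(x, y):
    """A outside an ideal thin solenoid on the z axis: magnitude FLUX / (2 pi r), azimuthal."""
    r2 = x * x + y * y
    return (-FLUX * y / (2 * math.pi * r2), FLUX * x / (2 * math.pi * r2))

def loop_integral(corners, steps=2000):
    """Midpoint-rule estimate of the closed line integral of A.dr around a polygon."""
    total = 0.0
    n = len(corners)
    for k in range(n):
        (x0, y0), (x1, y1) = corners[k], corners[(k + 1) % n]
        dx, dy = (x1 - x0) / steps, (y1 - y0) / steps
        for j in range(steps):
            xm = x0 + (j + 0.5) * dx
            ym = y0 + (j + 0.5) * dy
            ax, ay = vector_potential(xm, ym)
            total += ax * dx + ay * dy
    return total

# Counter-clockwise rectangles: one encloses the solenoid, the other does not.
enclosing = [(-1.0, -0.5), (2.0, -0.5), (2.0, 1.5), (-1.0, 1.5)]
outside = [(1.0, 1.0), (3.0, 1.0), (3.0, 2.0), (1.0, 2.0)]

flux_enclosing = loop_integral(enclosing)
flux_outside = loop_integral(outside)
print(flux_enclosing, flux_outside)
```

Only the loop that encircles the solenoid picks up a non-zero integral, and its value does not depend on the shape of the loop.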

The final conclusion that we take from the Aharonov-Bohm effect 
is the observation that the vector potential A is related to the space- 
dependent phase of the probability amplitude and the scalar potential 
φ is related to the time-dependent phase. So, from the phase of the
probability amplitude, we can get the 4-vector potential, and the other 
way around. 


6.5.4 Interactions from gauge symmetry 


The electromagnetic interactions have been introduced into the Dirac
equation following the classical ‘minimal coupling’ procedure, which
required modification of the derivatives:

$$\frac{\partial}{\partial t} \to D^0 = \frac{\partial}{\partial t} + iq\phi, \qquad \nabla \to \mathbf{D} = \nabla - iq\mathbf{A}$$

which can be written in manifestly Lorentz-covariant form as

$$D^\mu = \partial^\mu + iqA^\mu \qquad (6.60)$$

The derivative $D^\mu$ is the covariant derivative and we can start thinking
about the scalar potential φ and the vector potential A as connections
in a space to be defined. The outcome is the Dirac equation for the
electron interacting with a classical electromagnetic field represented by
the 4-vector potential (φ, A):

$$\left(i\frac{\partial}{\partial t} - q\phi\right)\Psi(\mathbf{r},t) = \left[\boldsymbol{\alpha}\cdot(-i\nabla - q\mathbf{A}) + \beta m\right]\Psi(\mathbf{r},t) \qquad (6.61)$$
Fig. 6.10 A particle travelling in
space-time and its internal space.
Adapted from [111].


Fig. 6.11 The internal space of
electromagnetism. Adapted from [111].


65 The formalism will only be outlined
here. For more details, see [111].


[Fig. 6.10 sketch: a fibre (the phase) attached to the particle's space-time trajectory; horizontal axis: space-time.]


We know that the potentials are not unique and we can perform the 
gauge transformations 6.58 and 6.59 without affecting Maxwell’s equa- 
tions and their solutions. Would the same apply to the Dirac equation 
6.61? The answer is negative. A solution of eqn 6.61 will differ from that 
obtained by solving eqn 6.61 after the transformations 6.58 and 6.59. But 
if, simultaneously with 6.58 and 6.59, we also make the transformation 


$$\Psi(\mathbf{r},t) \to \Psi'(\mathbf{r},t) = \exp[iq\lambda(\mathbf{r},t)]\,\Psi(\mathbf{r},t) \qquad (6.62)$$


then the Dirac equation 6.61 will be covariant with respect to the 
three combined gauge transformations 6.58, 6.59, and 6.62, and the 
solution will describe the same physics as that of eqn 6.61 before the 
transformations. We see that changes to the 4-vector potential affect 
the space-time-dependent phase of the wavefunction; it needs to be 
changed accordingly as well. As in the Aharonov-Bohm effect, they are 
connected. 
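
The covariance claimed here can be made explicit with a short worked step. It is a sketch that assumes c = 1 and the covariant derivatives $D^0 = \partial_t + iq\phi$ and $\mathbf{D} = \nabla - iq\mathbf{A}$ of the minimal-coupling prescription, together with the transformed potentials $\mathbf{A}' = \mathbf{A} + \nabla\lambda$, $\phi' = \phi - \partial_t\lambda$ and the transformed wavefunction $\Psi' = e^{iq\lambda}\Psi$:

```latex
\begin{aligned}
\mathbf{D}'\Psi' &= \left(\nabla - iq\mathbf{A} - iq\nabla\lambda\right)\left(e^{iq\lambda}\Psi\right) \\
 &= e^{iq\lambda}\left[\nabla\Psi + iq(\nabla\lambda)\Psi - iq\mathbf{A}\Psi - iq(\nabla\lambda)\Psi\right]
  = e^{iq\lambda}\,\mathbf{D}\Psi , \\[4pt]
D'^{0}\Psi' &= \left(\partial_t + iq\phi - iq\,\partial_t\lambda\right)\left(e^{iq\lambda}\Psi\right) \\
 &= e^{iq\lambda}\left[\partial_t\Psi + iq(\partial_t\lambda)\Psi + iq\phi\Psi - iq(\partial_t\lambda)\Psi\right]
  = e^{iq\lambda}\,D^{0}\Psi .
\end{aligned}
```

Every term of the Dirac equation written with covariant derivatives therefore acquires the same overall factor $e^{iq\lambda}$, which cancels, so the transformed equation has the same form as the original one.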

Now comes the crucial step. We reverse the flow of arguments. We first 
demand that we want interactions that are invariant with respect to the 
transformation 6.62. For that to happen, we need to modify the deriva- 
tives of the free-particle Dirac equation in such a way that the symmetry 
is obeyed. This requires a move from space-time derivatives to covariant
derivatives and the introduction of the 4-vector potential (φ, A). So,
demanding gauge symmetry with respect to the gauge transformation 
6.62, we get the classical electromagnetic field with which our particle is 
interacting. If, in addition, in all formulae, for example the one for the 
probability current, we replace derivatives by covariant derivatives, they 
will also be form-invariant. 

The last step is to introduce the internal space with its basis vectors 
on which the connection (the 4-vector potential) operates and extend 
the formalism to other interactions beyond electromagnetism.⁶⁵

We can imagine that a particle, moving in space-time, is carrying its 
internal space with it as sketched in Fig. 6.10. In mathematical language, 
the internal space is called a fibre and the whole structure, which locally 
is a product of the fibre and space-time, is called a fibre bundle. In the 
case of electromagnetism, the fibre is a circle, as sketched in Fig. 6.11. 
Each point on the circle, parameterized by a real function λ(r, t),
corresponds to a complex number with modulus 1, i.e. a one-dimensional
unitary matrix exp[iqλ(r, t)]. The set of all such matrices forms a group
called U(1), with ‘1’ for one-dimensionality. When the particle moves
from one space-time point to another, the phase of its wavefunction is
modified (eqn 6.62) and the 4-vector potential (φ, A) is modified as well
(eqns 6.59 and 6.58). A basis in the space U(1) consists of one particular
matrix, for example the unit matrix (in this case the real number 1),
which corresponds to λ = 0.

When the particle moves in space-time, λ(r, t) changes corresponding
to the changing basis.⁶⁶ The change of basis is represented by the
connection, which in turn gives us the change in the 4-vector potential (φ, A),
eqns 6.59 and 6.58. So our ‘differential’ equation is: the connection equals
the 4-vector potential.


66 With respect to four coordinates, requiring
four derivatives: one time and
three space directions.


67 Noting that $\exp(M) = 1 + M + \frac{M^2}{2!} + \cdots$


We are ready now to extend the formalism to include other
interactions. What will happen if in eqn 6.62, instead of the real function
λ(r, t), we insert a matrix M(r, t)?⁶⁷ Suppose that M belongs to SU(2),
the group of 2 × 2 complex unitary matrices with unit determinant (as
indicated by the letter S = ‘special’). SU(2) has 3 basis vectors, which
can be the Pauli matrices $\sigma_x$, $\sigma_y$, and $\sigma_z$. Each element of the group
can then be defined by 3 real numbers λ¹, λ², and λ³, and can be
represented as a point within a sphere of radius 2π in 3 dimensions. So
our particle carries with itself such a sphere, its internal space, and the
interaction that we get in this way will be the weak interaction.
Generalizing eqn 6.62, we require gauge symmetry with respect to the following
transformation:


$$\Psi(\mathbf{r},t) \to \Psi'(\mathbf{r},t) = \exp\left[iq\sum_{k=1}^{3}\lambda^k(\mathbf{r},t)\,\sigma_k\right]\Psi(\mathbf{r},t) \qquad (6.63)$$


The operator acting on the wavefunction Ψ is now a series of 2 × 2 matrices
and therefore our wavefunction Ψ gets one extra dimension and is
represented by two components (for historical reasons called projections
of the isotopic spin) related to the SU(2) internal space. In order to get
the 4-vector potential of the weak interaction, we consider the infinitesimal
change in the wavefunction when the space-time point is changed
from (r, t) to (r + dr, t + dt). All components of the wavefunction are
affected, but to get the 4-vector potential we select only the connection
part of the covariant derivative, i.e. the change in the basis vectors in
the internal space. That change is represented by the changes in λ¹, λ²,
and λ³, giving, finally,

$$A^\mu = \sum_{k=1}^{3}\left[\partial^\mu\lambda^k(\mathbf{r},t)\right]\sigma_k \qquad (6.64)$$
Therefore, if we want to introduce the weak interactions into the
free-particle Dirac equation, we must change space-time derivatives to
covariant derivatives, including the connection, i.e. the 4-vector potential
given by eqn 6.64.⁶⁸


68 The 4-vector potential is a 2 × 2 matrix
in the isotopic spin space generated
by the Pauli matrices.

The strong interaction is introduced in a similar way. The internal 
space is now SU(3), the group of 3 × 3 unitary complex matrices with unit
determinant. There are 8 basis vectors in that space, and therefore 8 real
numbers identify each matrix belonging to the group. The Pauli matrices 
generating the weak interactions are replaced by these 8 basis matrices 
of the SU(3) group and the wavefunction gains extra degrees of free- 
dom, becoming a 3-dimensional column vector in the colour space of the 
strong interaction. The corresponding 4-vector potential becomes a 3 x 3 
matrix. Changing derivatives to covariant derivatives causes the particle 
to interact with the classical colour field. 

After quantization of the classical weak and strong fields, we have 
three spin-1 bosons for the weak interaction and 8 spin-1 gluons for the 
strong interaction. All of them are massless at this stage. To get massive 
physical Z⁰, W⁺, and W⁻ bosons, we need an extra mechanism, like the
Higgs mechanism that will be discussed in Chapter 12. Elements of U(1) 
commute among themselves, but those of SU(2) or SU(3) do not. The 
physical consequences are that photons do not interact with each other 
but weak-interaction bosons and strong-interaction gluons do. 
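
The difference between the commuting U(1) phases and the non-commuting SU(2) elements can be seen in a small numerical sketch. It builds exp(iq Σ λᵏσₖ) from the closed form cos θ · 1 + i sin θ (n·σ), with θ = q|λ| and n the unit vector along λ; the function names `su2_element` and `u1_element`, the default q = 1, and the parameter values are illustrative assumptions.

```python
import numpy as np

# Pauli matrices stacked into an array of shape (3, 2, 2).
SIGMA = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

def su2_element(lam, q=1.0):
    """exp(i q sum_k lam^k sigma_k) via cos(theta) 1 + i sin(theta) n.sigma."""
    lam = np.asarray(lam, dtype=float)
    norm = np.linalg.norm(lam)
    theta = q * norm
    if theta == 0.0:
        return np.eye(2, dtype=complex)
    n_dot_sigma = np.einsum('i,ijk->jk', (lam / norm).astype(complex), SIGMA)
    return np.cos(theta) * np.eye(2) + 1j * np.sin(theta) * n_dot_sigma

def u1_element(lam, q=1.0):
    """U(1) element exp(i q lam): a complex number of modulus 1."""
    return np.exp(1j * q * lam)

U = su2_element([0.3, -0.2, 0.5])
V = su2_element([-0.1, 0.4, 0.2])

is_unitary = np.allclose(U.conj().T @ U, np.eye(2))
has_unit_det = np.isclose(np.linalg.det(U), 1.0)
su2_commute = np.allclose(U @ V, V @ U)
u1_commute = np.isclose(u1_element(0.3) * u1_element(1.1),
                        u1_element(1.1) * u1_element(0.3))

print(is_unitary, has_unit_det, su2_commute, u1_commute)
```

Two SU(2) elements commute only when their parameter vectors are parallel, which is why the generic pair above does not.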

For further reading, [13] is particularly recommended. 


Chapter summary 


• Special relativity, Lorentz transformation, invariance, and covariance
• Klein–Gordon and Dirac equations, covariant derivative
• Introduction of electromagnetic interactions, spin-orbit interaction


Exercises


(6.1) Using the definitions given in Section 6.3, derive the
continuity equation 6.21 and show that it may also
be written in the covariant form $\partial_\mu j^\mu = 0$.

(6.2) Follow Dirac’s derivation of ‘a relativistic equation
for the electron’. Start from $(p^2 - m^2)\psi = 0$, where
$p^2 = p_0^2 - |\mathbf{p}|^2$ and $c = 1$. Factorize the 4-momentum
operator as

$$\left(p_0 + \sqrt{\mathbf{p}^2 + m^2}\right)\left(p_0 - \sqrt{\mathbf{p}^2 + m^2}\right)$$

and apply only the second bracket to give

$$\left(p_0 - \sqrt{\mathbf{p}^2 + m^2}\right)\psi = 0$$

The difficulty is that quantum mechanics gives
operator expressions for $p_0$ and $p_r$, which in the
Schrödinger representation are first-order derivatives
in time and space, respectively. The spatial
derivatives are now under a square root. Dirac
solved the problem by writing

$$\sqrt{\mathbf{p}^2 + m^2} = \alpha_1 p_1 + \alpha_2 p_2 + \alpha_3 p_3 + \beta m$$

where

$$p_0 = i\hbar\frac{\partial}{\partial t}, \qquad p_r = -i\hbar\frac{\partial}{\partial x_r} \quad (r = 1, 2, 3)$$

Show that to satisfy $E^2 = \mathbf{p}^2 + m^2$, it is necessary
that

$$\alpha_i\alpha_j + \alpha_j\alpha_i = 2\delta_{ij}, \qquad \alpha_i\beta + \beta\alpha_i = 0$$

($i,j = 1,2,3$), with $\beta^2 = 1$.

Dirac showed that the simplest realization of
these constraints is given by 4-dimensional matrices.
It should be noted that at this stage of the
historical development, the wavefunction $\psi$ had unknown
Lorentz transformation properties; they had
to be derived next.

(6.3) The covariant Dirac matrices are defined by

$$\gamma^0 = \beta, \qquad \gamma^i = \beta\alpha_i$$

so that $\gamma\cdot p$ is a 4-vector scalar product, often
written as $\not{p}$, with the Dirac equation then being
$(\not{p} - m)\psi = 0$. Using the definitions of the Dirac
matrices given in Section 6.4, follow the text to
derive the defining relation

$$\gamma^\mu\gamma^\nu + \gamma^\nu\gamma^\mu = 2g^{\mu\nu} \tag{6.65}$$

(6.4) Using the information given at the end of Section 6.4.2
on helicity conservation in $\pi^\pm$ decays,
estimate the relative decay rates for $e\nu_e$ and $\mu\nu_\mu$
modes. (The two-body phase factor is discussed in
Chapter 2.)

(6.5) Write down the Dirac equation for free electrons.
Derive the adjoint Dirac equation and an expression
for the probability current 4-vector.

The free-electron positive-energy solutions of the
Dirac equation in the standard representation are

$$\psi_s(\mathbf{r},t) = N\begin{pmatrix}\chi^s\\[4pt] \dfrac{\boldsymbol{\sigma}\cdot\mathbf{p}}{E+m}\,\chi^s\end{pmatrix} e^{i(\mathbf{p}\cdot\mathbf{r} - Et)/\hbar}$$

where $s = 1,2$, $\chi^1 = \begin{pmatrix}1\\0\end{pmatrix}$ and $\chi^2 = \begin{pmatrix}0\\1\end{pmatrix}$. Find a
normalization factor $N$ and explain why these wavefunctions
are not normalized to one particle per unit
volume.

The operator for spin projection on the x axis is

$$\Sigma_1 = \frac{1}{2}\begin{pmatrix}\sigma_1 & 0\\ 0 & \sigma_1\end{pmatrix}$$

Find its eigenvalues and eigenvectors, and show
that in the non-relativistic limit one can form linear
combinations of $\psi_1(\mathbf{r},t)$ and $\psi_2(\mathbf{r},t)$ that are
eigenvectors of $\Sigma_1$ but at relativistic energies this is
impossible.

(6.6) Given that $\psi(x) = u(p)e^{-ip\cdot x}$ is a solution of the
Dirac equation, where $\bar{u}(p)u(p) = 2m$, derive expressions
for $u(p)$ with $p = (E, 0, 0, p_z)$. Here, $p$
is the 4-momentum, $p\cdot p = m^2$, and $x$ is the
4-displacement.

What are the $u(p)$ spinors in the ultrarelativistic
limit? What are their helicities? Why is the Dirac
representation more suitable in the low-energy
limit?

(6.7) A free-particle solution of the Dirac equation is
$\psi(x) = C(p)e^{-ip\cdot x}$, where $p = (E, 0, 0, p_3)$. Find
the normalized $E > 0$ spinors $C_+(p)$ and $C_-(p)$
for positive- and negative-helicity states in the
standard Dirac representation and using covariant
normalization.

The operator for spin projection on the x axis is

$$\Sigma_1 = \frac{1}{2}\begin{pmatrix}\sigma_1 & 0\\ 0 & \sigma_1\end{pmatrix}$$

Show that neither $C_+(p)$ nor $C_-(p)$ is an eigenstate
of $\Sigma_1$.

The expectation value of $\Sigma_1$ is

$$\langle\Sigma_1\rangle = C^\dagger(p)\,\Sigma_1\,C(p)$$

where $C = C_+(p)\cos\alpha + C_-(p)\sin\alpha$, with $\alpha$ a
real constant, and normalization is to unit volume.
Calculate $\langle\Sigma_1\rangle$ and interpret the result in
the non-relativistic and ultrarelativistic limits for
pI 
eF 


Q&Q = {q Ma 


(6.8) Give the expression for the total energy operator $H$,
including the electromagnetic potential $(A^0, \mathbf{A})$, for
a particle of mass $m$.

Introducing the upper and lower bispinor components
$(\varphi, \chi)$ of $\psi$ and writing $H = m + H_{nr}$,
derive an expression for $H_{nr}$. Using $H_{nr}$, show
that, under certain conditions to be specified,
$\chi \sim (\text{velocity}/c) \times \varphi$, for a stationary state
$H\psi = E\psi$.

Under these conditions, show how the
Dirac equation reduces to the non-relativistic
Schrödinger-Pauli equation for a spin-½ particle
with energy $E_{nr} = E - m$. Identify the term giving
the interaction of the particle’s spin with the
magnetic field $\mathbf{B} = \nabla \times \mathbf{A}$ and comment on its
magnitude.

[The identity $(\boldsymbol{\sigma}\cdot\mathbf{a})(\boldsymbol{\sigma}\cdot\mathbf{b}) = \mathbf{a}\cdot\mathbf{b} + i\boldsymbol{\sigma}\cdot(\mathbf{a}\times\mathbf{b})$ may
be assumed.]


Weak interactions 


This chapter introduces the theory of weak interactions and electroweak 
unification. The experimental evidence for this theory will be reviewed 
in Chapter 8. We give a description of electroweak interactions with 
emphasis on the key physics ingredients. More mathematical treatments 
(graduate-level textbooks) are listed in Further Reading at the end of the 
chapter. The Higgs mechanism and supporting experimental evidence 
will be covered in Chapter 12. 

The weak force is responsible for the β decay of nuclei, for the decay of
the lightest hadrons such as the pion, kaon, and neutron, and for neutrino
interactions. The weak interaction violates spatial parity conservation
and facilitates flavour-changing interactions. The quanta of the weak
force are the massive $W^\pm$ and $Z^0$. Weak interactions mediated by the
$W^\pm$ are termed ‘charged-current’ (CC) and those by the $Z^0$ ‘neutral-current’
(NC). In this chapter, the evidence for the ‘universality’ of the
weak interaction is outlined and its importance explained. Figure 7.1
shows neutron β decay at the quark level; it is a CC decay. At this
energy scale (~1 GeV), the W will hardly propagate and it is safe to
approximate the propagator $g_W^2/(M_W^2 - q^2)$ by a constant, which we will
see is closely related to the Fermi constant $G_F$ (the concept of a force
propagator is outlined at the start of Chapter 9).

The proposal by Glashow, Weinberg, and Salam (independently) that 
the weak and electromagnetic interactions should be unified was a key 
step for particle physics. At first sight, this seems difficult to achieve 
because the photon is massless but the weak bosons are not—the 
Higgs mechanism for mass generation is crucial for this. The discov- 
ery of ‘weak neutral current’ interactions in neutrino scattering was 
the first indication that these ideas might be correct. The discovery 
in 1983 by the UA1 and UA2 experiments at the $Sp\bar{p}S$ collider at
CERN of the $W^\pm$ and $Z^0$ particles with masses close to the predicted
values was a seminal moment in the development of the Standard
Model.

Other topics covered in this chapter are 


• the generalization of the idea of a current-current interaction;
• left- and right-handed particles;
• the CKM matrix, which connects the quark mass eigenstates with
  the weak eigenstates;
• the GIM mechanism, which explains how flavour-changing weak
  decays are suppressed despite the universality of the weak force.


Particle Physics in the LHC Era, Giles Barr, Robin Devenish, Roman Walczak,
& Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak,
& Tony Weidberg 2016. Published in 2016 by Oxford University Press.

Fig. 7.1 Neutron β decay at the quark level, a CC weak decay. Note
that the spectator ud pair in the n and p is omitted.

1 How to calculate dN/dE is covered in Chapter 2.


Much of the early work on weak interactions focused on understanding 
the seeming complexity of particle decays. Experiments made possible by 
the high energies of the LEP and LHC colliders have enabled the richness 
and underlying symmetry of electroweak interactions to be understood 
at a more profound level. After a brief review of the Fermi theory of 
β decay, the chapter is divided into three main sections dealing with
weak interactions of leptons, weak interactions including quarks, and 
electroweak unification, respectively. 


7.1 Fermi theory 


The original theory of nuclear β decay, by Fermi, uses his ‘Golden Rule’
of quantum mechanics to calculate rates. For example, the decay rate of
the neutron ($n \to p\,e^-\bar{\nu}_e$) is given by

$$w = \frac{2\pi}{\hbar}\,|M|^2\,\frac{dN}{dE} \tag{7.1}$$

where $M$ is the matrix element given by

$$M = \int \psi_f^{*}\,O\,\psi_i\,dV = \int \psi_p^{*}\psi_e^{*}\psi_{\bar{\nu}}^{*}\,O\,\psi_n\,dV \tag{7.2}$$


$M$ involves the initial- and final-state wavefunctions $\psi_i$ and $\psi_f$, respectively,
and $O$ is an operator describing the interaction, which was assumed
to be local, with a strength given by $G_F$. This amounts to ignoring any
propagator of the weak force. Given the very large mass of the $W^\pm$ boson
(~80 GeV), for decays of leptons and most hadrons (masses up to a
few GeV), the propagator $g_W^2/(M_W^2 - q^2)$ can be approximated by the
constant $G_F \sim g_W^2/M_W^2$. The wavefunctions of the electron and the neutrino
were assumed to be constant over the volume of the nucleus and
were moved outside the integral; the volume over which they are normalized
cancels with the volume used to calculate the density of states
dN/dE.¹ Another simplification was to use Schrödinger wavefunctions
$\psi$ rather than the Dirac spinors that are needed for a correct relativistic
description of spin-½ particles.
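
To get a feeling for how good the constant-propagator approximation is, the short sketch below compares $1/(M_W^2 - q^2)$ with $1/M_W^2$ for a few momentum transfers. The mass values are approximate inputs supplied here for illustration (the text quotes only $M_W \sim 80$ GeV), the choice $q \sim 1$ MeV for neutron decay is an assumption, and the function name `propagator_ratio` is hypothetical.

```python
# Approximate masses in GeV; illustrative inputs, not values taken from the text
M_W = 80.4
momentum_transfers = {
    "neutron beta decay (q ~ 1 MeV, assumed)": 1.0e-3,
    "muon decay (q ~ m_mu)": 0.1057,
    "tau decay (q ~ m_tau)": 1.777,
}

def propagator_ratio(q, m_w=M_W):
    """Ratio of the factor 1/(M_W^2 - q^2) to the constant 1/M_W^2."""
    return m_w**2 / (m_w**2 - q**2)

# Fractional error made by replacing the propagator with a constant
corrections = {name: propagator_ratio(q) - 1.0
               for name, q in momentum_transfers.items()}

for name, corr in corrections.items():
    print(f"{name}: fractional correction = {corr:.2e}")
```

Even for the tau, the heaviest lepton, the correction is below one part in a thousand, which is why a point-like (Fermi) interaction describes low-energy weak decays so well.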

Some nuclear decays involve larger changes in nuclear spins than can 
be accommodated by the spin 0 or 1 change from the electron—neutrino 
pair. These decays tend to proceed more slowly. This leads to a jargon of 
‘forbidden decays’—for which the decay could occur with a change in the 
nuclear orbital angular momentum but more slowly than the ‘allowed 
decays’. The terminology is a bit unfortunate, but it does introduce a 
very important idea. If a process occurs at a slower rate than one would
expect from the phase space available, or not at all, then there could
be an inhibition from angular momentum conservation or some other
conserved quantum number.


7.2 Weak interactions of leptons


Many of the key features of the weak interaction can be understood using 
leptonic decays and interactions without the additional complication 
of hadronic structure. Some examples of lepton properties that need 
explanation are the absence of the decay $\mu \to e\gamma$ (the upper limit on the
branching ratio $\mu^- \to e^-\gamma$ is $1.2 \times 10^{-11}$), the very different lifetimes of
the $\pi^+$ ($2.6 \times 10^{-8}$ s) and $\pi^0$ ($10^{-16}$ s), and the absence of purely leptonic
decays such as $\mu^\pm \to e^\pm\nu$ or $\tau^\pm \to \mu^\pm\nu$. The much higher decay rate for
the $\pi^0$ is straightforward: $J^{PC}$ of the $\pi^0$ is $0^{-+}$, so the electromagnetic
decay $\pi^0 \to \gamma\gamma$ is allowed. So far, there is no evidence that any of the
leptons are other than point-like particles. 


7.2.1 Lepton number 


The above brief summary of leptonic properties leads to the necessity 
for new conserved quantum numbers—the lepton numbers. For example, 
we assign $L_e = +1$ to the $e^-$ and $\nu_e$, $L_e = -1$ to the $e^+$ and $\bar{\nu}_e$, and
$L_e = 0$ to all other particles. There are exactly analogous conservation
laws for each of the other two generations involving $L_\mu$ and $L_\tau$.² The
details were summarized in Chapter 1, Table 1.1, which is repeated here 
for convenience as Table 7.1. 

The lepton number conservation laws along with charge conservation 
and baryon number conservation are reflected in the fact that there 
is only a single vertex for each lepton generation that can be used in 
constructing Feynman diagrams for weak charged-current interaction. 
They are shown in Fig. 7.2. 
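
The bookkeeping implied by these conservation laws is simple enough to automate. The sketch below is an illustration only: the particle labels (for example `anti_nu_e`) and the function names are hypothetical choices made here, and the lepton-number assignments follow Table 7.1.

```python
# Lepton numbers (L_e, L_mu, L_tau) following Table 7.1.
# The string labels are illustrative names chosen for this sketch.
PARTICLES = {
    "e-": (1, 0, 0), "nu_e": (1, 0, 0),
    "mu-": (0, 1, 0), "nu_mu": (0, 1, 0),
    "tau-": (0, 0, 1), "nu_tau": (0, 0, 1),
}
ANTIPARTICLES = {
    "e+": "e-", "anti_nu_e": "nu_e",
    "mu+": "mu-", "anti_nu_mu": "nu_mu",
    "tau+": "tau-", "anti_nu_tau": "nu_tau",
}

def lepton_numbers(name):
    """Return (L_e, L_mu, L_tau); antiparticles carry the opposite signs."""
    if name in PARTICLES:
        return PARTICLES[name]
    if name in ANTIPARTICLES:
        return tuple(-n for n in PARTICLES[ANTIPARTICLES[name]])
    return (0, 0, 0)  # photons, hadrons, and anything else

def total(names):
    """Sum each lepton number over a list of particles."""
    return tuple(sum(lepton_numbers(n)[i] for n in names) for i in range(3))

def conserves_lepton_numbers(initial, final):
    """True if all three lepton numbers are the same before and after."""
    return total(initial) == total(final)

print(conserves_lepton_numbers(["mu-"], ["e-", "anti_nu_e", "nu_mu"]))
print(conserves_lepton_numbers(["mu-"], ["e-", "gamma"]))
```

The first process is ordinary muon decay and passes the check; the second, $\mu^- \to e^-\gamma$, changes both $L_e$ and $L_\mu$ and is rejected, in line with its unobserved status.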


Table 7.1 Lepton properties.

State        Q    Mass                     L_e   L_μ   L_τ   Lifetime
$e^-$        −1   0.511 MeV                +1    0     0     $> 4.6 \times 10^{26}$ years
$\nu_e$      0    < 2 eV                   +1    0     0     Stable
$\mu^-$      −1   105.7 MeV                0     +1    0     $2.197034(21) \times 10^{-6}$ s
$\nu_\mu$    0    < 0.19 MeV               0     +1    0     Stable
$\tau^-$     −1   1776.82 ± 0.16 MeV       0     0     +1    $(290.6 \pm 1.0) \times 10^{-15}$ s
$\nu_\tau$   0    < 18.2 MeV               0     0     +1    Stable

2 Although it was assumed at the time of its discovery that the τ lepton
was a so-called sequential lepton with its own lepton number and an
associated $\nu_\tau$, it was not until 1997 that the DONUT experiment
confirmed directly that a $\nu_\tau$ beam produced the charged τ.

Fig. 7.2 Possible leptonic charged-current vertices.


7.2.2 Feynman rules


Calculations of high-energy interactions respecting Lorentz covariance
and allowing for particle creation and annihilation are difficult even
for weak or electromagnetic interactions. Richard Feynman invented a
very elegant graphical formalism that provides a considerable shortcut in
calculations. The full mathematical treatment is given in the advanced
textbooks in Further Reading, but a brief outline is given here since
this formalism has become an essential part of the ‘language’ of particle
physics; it will also provide a useful bridge to the more advanced texts.
It ends up being formulated as a set of ‘Feynman rules’, in other words
a recipe for performing the calculation. To show that Feynman rules
actually work and give the right answer is even further beyond the scope
of this book.

3 Although internal particles do not have to be on the mass shell, they do
have to be consistent with the constraints of the time-energy uncertainty
relation $\Delta E\,\Delta t \lesssim \hbar/2$.

4 Note that the W width has been ignored here; it gives an imaginary
part to the propagator, analogous to the expression for a Breit-Wigner
amplitude (see Section 2.5.4).

Fig. 7.3 Feynman diagram for a tau-lepton decay to an electron and
two neutrinos.

Feynman’s recipe is as follows. First, the process from incoming par- 
ticles to outgoing particles is described in terms of ‘Feynman diagrams’ 
where lines represent ‘quanta’ or particles in momentum space. Lines 
representing the initial and final particles in an interaction must be 
‘on-mass-shell’ particles, i.e. ones where $E^2 = p^2 + m^2$. Internal lines
represent ‘virtual particles’, which do not have to satisfy the mass- 
shell constraint. Diagrams can have as many complicated lines as you 
like (but all connected), although with weak and electromagnetic inter- 
actions those with the fewest vertices (the lowest-order diagrams) are the 
most important. To do the calculation, all diagrams up to a given order 
must be considered and the amplitudes added together to get the matrix 
element. In this way, it is possible to have interference (constructive or 
destructive) between two diagrams. 

The second step in Feynman’s recipe is to write down elements in a 
mathematical expression for each external line, vertex, and internal line 
(propagators) according to the rules, and the third step is to evaluate the 
expression. If there are internal loops, the evaluation involves integrating 
over the unconstrained momenta of those particles. The recipe ensures 
energy and momentum conservation and it is also built into the rules 
that the resulting expression is Lorentz-invariant. 

Apart from being a precise mathematical tool for relativistic field 
theory calculations, Feynman diagrams give a very intuitive ‘billiard 
ball’ picture of what is a complicated interaction of waves and quanta. 
Other, now-commonplace ideas such as vertex factors (e.g. $\alpha = e^2/4\pi$
for the electromagnetic interaction) and the propagators $1/q^2$ for a
massless photon and $1/(M_W^2 - q^2)$ for the W for internal lines come
from the Feynman rules.⁴


An example: tau decay 


Using the Feynman rules, we will find the lowest-order matrix element
for a purely leptonic decay mode of the tau lepton $\tau^- \to \nu_\tau e^- \bar{\nu}_e$
(Fig. 7.3). The Fermi expression for this process involves an integral
over Schrödinger wavefunctions $\psi$:

$$M_{fi} = G_F \int \psi_f^{*}\,\psi_i\,dV \tag{7.3}$$




The equivalent expression using Feynman rules and Dirac spinors is

$$M_{fi} = \left[\frac{g_W}{\sqrt{2}}\,\bar{u}(\nu_\tau)\,\gamma_\mu\tfrac{1}{2}(1-\gamma_5)\,u(\tau)\right]\frac{1}{M_W^2 - q^2}\left[\frac{g_W}{\sqrt{2}}\,\bar{u}(e)\,\gamma^\mu\tfrac{1}{2}(1-\gamma_5)\,v(\bar{\nu}_e)\right] \tag{7.4}$$

where $1/(M_W^2 - q^2)$ is the propagator and the two expressions in square
brackets are the vertex factors. This is an example of a ‘current-current’
interaction as described in Chapter 6. The vertex factors involve a particular
way of combining the spinors: inserting a matrix $\gamma_\mu\frac{1}{2}(1-\gamma_5)$
between the spinor and the adjoint spinor. This particular combination
is called ‘V−A’; more details are given in Section 7.2.4. The factor $g_W$ is
called the weak coupling constant and is analogous to the electric charge
$e = \sqrt{4\pi\alpha}$ (in natural units) in electromagnetism, and $q^2$ in the propagator
is the four-momentum squared of the W particle. The W does not
have to be on the mass shell, since it is a virtual particle; however, since
$q^2 \sim m_\tau^2$ and $m_\tau \ll M_W$, we can approximate $1/(M_W^2 - q^2) \to 1/M_W^2$,
in which case we may use

$$\frac{G_F}{\sqrt{2}} = \frac{g_W^2}{8M_W^2} \tag{7.5}$$

to relate the weak coupling $g_W$ to the Fermi constant $G_F$. The expression
for the matrix element becomes

$$M_{fi} = \frac{4G_F}{\sqrt{2}}\left[\bar{u}(\nu_\tau)\,\gamma_\mu\tfrac{1}{2}(1-\gamma_5)\,u(\tau)\right]\left[\bar{u}(e)\,\gamma^\mu\tfrac{1}{2}(1-\gamma_5)\,v(\bar{\nu}_e)\right] \tag{7.6}$$

which, apart from the γ matrices, now looks very much like the expression
from Fermi theory in eqn 7.3. The factors of 4 and $\sqrt{2}$ are simply
to keep the definition of $G_F$ the same as originally defined by Fermi.
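
The relation $G_F/\sqrt{2} = g_W^2/(8M_W^2)$ can be inverted to estimate the size of the weak coupling. The sketch below does this numerically; the values of $G_F$, $M_W$, and the fine-structure constant are standard approximate inputs supplied here, not numbers quoted in this section, and the variable names are illustrative.

```python
import math

# Illustrative inputs (approximate standard values, not taken from the text)
G_F = 1.1663787e-5       # Fermi constant in GeV^-2
M_W = 80.4               # W boson mass in GeV
alpha_em = 1.0 / 137.036  # fine-structure constant

# Invert G_F / sqrt(2) = g_W^2 / (8 M_W^2) for g_W
g_w = math.sqrt(8.0 * G_F * M_W**2 / math.sqrt(2.0))

# Weak analogue of the fine-structure constant, alpha_W = g_W^2 / (4 pi)
alpha_w = g_w**2 / (4.0 * math.pi)

print(f"g_W     = {g_w:.3f}")
print(f"alpha_W = {alpha_w:.4f}  (alpha_em = {alpha_em:.4f})")
```

With these inputs the dimensionless weak coupling comes out larger than the electromagnetic one, which illustrates that the apparent weakness of low-energy weak processes stems from the large W mass in the propagator rather than from a small coupling.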


7.2.3 Universality 


The model of weak interactions contains the postulate that


$g_W$ is the same for all weak interactions ...


... well, all weak charged-current interactions involving leptons. To bring
quarks into the picture, we need Cabibbo theory and its extension by
Kobayashi and Maskawa, and to include the neutral current, we need
the Glashow, Weinberg, and Salam theory of electroweak unification.

There have been many experiments using a wide range of interactions
and energies to test universality, measuring and comparing $g_W$. So far,
the postulate that $g_W$ is the same is holding up very well. We will discuss
some of these experiments in Chapter 8.

5 See Section 7.3 of [80] for more details, starting from how a Dirac
spinor transforms under a Lorentz transformation.


Left-handed particles ...


You will recognize from Chapter 6 that the $\frac{1}{2}(1-\gamma_5)$ in the operator is
the left-hand chirality projection operator $P_L$ (which, in the high-energy
limit, is equivalent to a helicity projection). Denoting the left-handed
part of the τ as $u_L(\tau)$, we have

$$u_L(\tau) = P_L\,u(\tau) = \tfrac{1}{2}(1-\gamma_5)\,u(\tau) \tag{7.7}$$

This means that the $\frac{1}{2}(1-\gamma_5)\,u(\tau)$ part of eqn 7.6 can be replaced by
$u_L(\tau)$. We say that the weak charged-current interaction acts on only the
left-handed part of the particle. We will formulate this more rigorously
when we come to electroweak unification.


... and right-handed antiparticles

The $\frac{1}{2}(1-\gamma_5)$ operator, when acting on the spinor of an antiparticle
(using $v$ rather than $u$ to denote a spinor of an antiparticle), projects
out the right-handed chirality of the antiparticle, i.e.

$$v_R = \tfrac{1}{2}(1-\gamma_5)\,v \tag{7.8}$$


The weak charged current acts only on the left-handed part of 
particle wavefunctions and the right-handed part of antiparticle 
wavefunctions. 
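
The algebraic properties that make $\frac{1}{2}(1-\gamma_5)$ a projection operator can be verified with explicit matrices. The following is a minimal sketch using NumPy and the Dirac (standard) representation of the γ matrices; the variable names `P_L` and `P_R` are chosen here for illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Gamma matrices in the Dirac (standard) representation
gammas = [np.block([[I2, Z2], [Z2, -I2]])]
gammas += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
gamma5 = 1j * gammas[0] @ gammas[1] @ gammas[2] @ gammas[3]

I4 = np.eye(4, dtype=complex)
P_L = 0.5 * (I4 - gamma5)   # left-handed chirality projector
P_R = 0.5 * (I4 + gamma5)   # right-handed chirality projector

# Projector properties: idempotent, orthogonal, complete
idempotent = np.allclose(P_L @ P_L, P_L)
orthogonal = np.allclose(P_L @ P_R, 0)
complete = np.allclose(P_L + P_R, I4)

# gamma^mu P_L = P_R gamma^mu, because gamma5 anticommutes with every gamma^mu
swaps = all(np.allclose(g @ P_L, P_R @ g) for g in gammas)

print(idempotent, orthogonal, complete, swaps)
```

The last identity implies $\gamma^\mu P_L = P_R\gamma^\mu P_L$, so the projector inserted at the vertex also selects the left-handed part of the adjoint spinor: both fermion lines at a charged-current vertex are left-handed.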


7.2.4 V−A


We now return to the question of why the Feynman rules were constructed
so that the matrix between the spinors was $\gamma_\mu\frac{1}{2}(1-\gamma_5)$. The
first step is to construct a scalar quantity (i.e. a number that is Lorentz-invariant)
out of a spinor $u$. The answer is that $\bar{u}u$ is such a quantity.
It is the simplest of the ‘bilinear covariants’ for the Dirac equation and
transforms as a Lorentz scalar. There are 16 such combinations⁵ and
these can be arranged as shown in Table 7.2 to construct different types
of Lorentz-covariant quantities. It can be shown that this is the only way
to do it and this is the complete set of different combinations that can
be made. It is also interesting for this discussion to see what happens
under a parity transformation. The combination $\bar{u}u$ remains the same
under a parity transformation. We can also construct other numbers by
inserting various combinations of gamma matrices in the middle. For
example, $\bar{u}\gamma_5 u$ is also a Lorentz-invariant scalar; however, in this case,
the parity operation produces $-\bar{u}\gamma_5 u$, i.e. there is a change in sign. This
is therefore called a ‘pseudoscalar’.

Table 7.2 Lorentz covariant combinations of spinors.

Combination                      Symbol  Name          Number     Parity P
                                                       of parts   transformation
$\bar{u}u$                       S       Scalar        1          + under P
$\bar{u}\gamma_\mu u$            V       Vector        4          (+,−,−,−)
$\bar{u}\sigma_{\mu\nu}u$        T       Tensor        6
$\bar{u}\gamma_\mu\gamma_5 u$    A       Axial vector  4          (−,+,+,+)
$\bar{u}\gamma_5 u$              P       Pseudoscalar  1          − under P

Here $\sigma_{\mu\nu} = \frac{i}{2}(\gamma_\mu\gamma_\nu - \gamma_\nu\gamma_\mu)$.

So we have a choice of S, V, T, A, or P (or some combination) for
the interaction type for weak interactions. The second step is to turn to
experiment to try to select the correct combination (see [57, pp. 398-401]).
It involves looking at Fermi ($\Delta J = 0$, where $J$ is the total
spin of the nucleus) and Gamow-Teller ($\Delta J = 0, 1$) allowed β decays
and comparing various features of the decay electron spectra to see if
the Fermi transitions are S or V or a combination of the two. The
Gamow-Teller transitions tell us about the choice between A and T.
After considerable experimental investigation, it was determined to be
a combination of V and A, with no contributions from S or T (P cannot
produce a large contribution). Further measurements gave the relative
contributions of V and A (equal) and the sign (negative). So:


The weak interaction is V−A, i.e. $\gamma_\mu(1-\gamma_5)$.


7.2.5 Parity violation 


We have just decided that the weak interaction has a V−A form, i.e. that
the spinor combinations $\bar{u}\gamma_\mu u$ and $\bar{u}\gamma_\mu\gamma_5 u$ appear in equal amounts.
The first combination is a Lorentz 4-vector and the second is an axial
vector. We use the V−A combination to describe the parity-violating
weak interaction:

$$\bar{u}\gamma_\mu u - \bar{u}\gamma_\mu\gamma_5 u = \bar{u}\gamma_\mu(1-\gamma_5)u \tag{7.9}$$

We now turn to the properties of these quantities under a parity
operation. Both $\bar{u}\gamma_\mu u$ and $\bar{u}\gamma_\mu\gamma_5 u$ have definite parity transformation
properties; however, they are opposite. If the weak interaction were
a single type, i.e. either V or A, then there would not be parity
violation, because the final state would have definite parity. The V−A
combination $\bar{u}\gamma_\mu(1-\gamma_5)u$, however, has a mixture of parity change or
non-change and this results in parity violation. Since the V and A parts
come in equal amounts, parity is said to be maximally violated. We
will review the experimental evidence for parity violation in the weak 
interaction in Chapter 8. 
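
The opposite parity behaviour of the V and A currents can be checked explicitly. Under parity a spinor transforms as $u \to \gamma^0 u$, so a bilinear $\bar{u}\,\Gamma\,u$ goes to $\bar{u}\,\gamma^0\Gamma\gamma^0\,u$. The sketch below, which assumes the Dirac representation and uses the illustrative helper name `parity_sign`, evaluates the resulting sign for each component.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
pauli = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Gamma matrices in the Dirac (standard) representation
gamma = [np.block([[I2, Z2], [Z2, -I2]])]
gamma += [np.block([[Z2, s], [-s, Z2]]) for s in pauli]
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]
I4 = np.eye(4, dtype=complex)

def parity_sign(mat):
    """Sign acquired by u-bar mat u under parity (mat -> gamma0 mat gamma0).

    Raises ValueError if the combination has no definite parity.
    """
    transformed = gamma[0] @ mat @ gamma[0]
    for sign in (+1, -1):
        if np.allclose(transformed, sign * mat):
            return sign
    raise ValueError("combination has no definite parity")

vector_signs = [parity_sign(g) for g in gamma]          # V components
axial_signs = [parity_sign(g @ gamma5) for g in gamma]  # A components

# A spatial component of the V-A combination has no definite parity
try:
    parity_sign(gamma[1] @ (I4 - gamma5))
    v_minus_a_definite = True
except ValueError:
    v_minus_a_definite = False

print(vector_signs, axial_signs, v_minus_a_definite)
```

The two sign patterns reproduce the (+,−,−,−) and (−,+,+,+) entries of Table 7.2, and the failed check for the V−A combination is the matrix-level statement of parity violation.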


7.2.6 Currents and fields 


We now give a reminder of the meaning of a current (see Chapter 6),
since it will feature a lot in the electroweak theory. Consider the decay
of the τ, as shown in Fig. 7.4(a). The contents of the square brackets in
eqn 7.6 are each referred to as currents:

$$J^{+}_{\mu} = \bar{u}(\nu_\tau)\,\gamma_\mu\tfrac{1}{2}(1-\gamma_5)\,u(\tau) \tag{7.10}$$

is a charge-raising current (because the outgoing $\nu_\tau$ is one positive
increment in charge bigger than the incoming $\tau^-$). Similarly,

$$J^{-}_{\mu} = \bar{u}(e)\,\gamma_\mu\tfrac{1}{2}(1-\gamma_5)\,v(\bar{\nu}_e) \tag{7.11}$$

is a charge-lowering current. These charge-raising (lowering) currents are
the reason why this sort of interaction is called a charged current: the
particle in the middle is a $W^+$ or a $W^-$ and charge needs to be conserved
at each vertex. The other type of weak interaction is the neutral current
and involves the exchange of the $Z^0$.

We can now write the matrix element in eqn 7.4 as

$$M_{fi} = \frac{g_W}{\sqrt{2}}\,J^{+}_{\mu}\,\frac{1}{M_W^2 - q^2}\,\frac{g_W}{\sqrt{2}}\,J^{-\mu} \tag{7.12}$$

This is the so-called current-current formalism of an interaction; i.e. the
interaction occurs between the two currents, as shown in Fig. 7.4(b).

Another variation on this theme is to treat the two halves of the
Feynman diagram separately. We can ‘plug-and-play’, i.e. change the
particles at one vertex (e.g. from $(e, \nu_e)$ to $(d, u)$) without changing the
calculation at the other vertex. We do this by introducing the concept
of a field $W^\mu$,

$$W^{\mu} = \frac{g_W}{\sqrt{2}}\,\frac{1}{M_W^2 - q^2}\,\frac{g_W}{\sqrt{2}}\,J^{-\mu} \tag{7.13}$$

and the matrix element is

$$M_{fi} = J^{+}_{\mu}\,W^{\mu} \tag{7.14}$$

This is shown in Fig. 7.4(c).

Another example of a current × field interaction is the electromagnetic
interaction. The equivalent current for an electromagnetic interaction
(e.g. for an electron, with negative charge) is

$$J^{EM}_{\mu} = -\bar{u}(e)\,\gamma_\mu\,u(e) \tag{7.15}$$

and the electromagnetic field is $A^\mu$, giving⁶ $M_{fi} = J^{EM}_{\mu}A^{\mu}$.

This discussion in terms of currents and fields is reminiscent of electric
currents and magnetic fields: a quantity called a magnetic field
is constructed at all points in space to describe the action of all the
bits of current that are flowing in all circuits nearby. This can then
be used to determine the force on a particular piece of current placed
somewhere.⁷

Fig. 7.4 (a) Feynman diagram for τ decay. (b) Same interaction showing
the concept of two interacting currents. (c) Same, now showing the
concept of a current interacting with a field.

6 Note the absence of the projection operator because the photon couples
with equal strength to the left-handed and right-handed parts.

7 As we showed in Chapter 6, the continuity equation for the Dirac equation
gives the expression for the Dirac spinor equivalent of a probability
current as $\bar{u}\gamma_\mu u$.


7.3 Weak interactions including quarks


Having understood the V—A structure of the weak interaction in the 
lepton sector, we are now ready to extend the idea of ‘universality’ to 
the strong sector and quarks. We shall see that to maintain universal- 
ity it will be necessary to accept that the quark eigenstates involved in 
the weak interaction are linear combinations of those involved in strong 
interactions. This requires the introduction of Cabibbo theory and its 
extension to the CKM matrix. The result is a simple picture that ex- 
plains a large number of experimental measurements. The background 
to the problem almost predates accelerator particle physics, going back 
to a time when the highest-energy data came from cosmic-ray physics. 
It was noted that some of the particles behaved in a strange way: after
correction for phase space, they seemed to decay about a factor of 20
more slowly than those that were ‘not strange’.⁸

8 The ‘slow ones’ turn out to contain a strange quark, hence the quark name.


7.3.1 Cabibbo theory 


Figure 7.5(a) shows the quarks arranged in families with the possible
charged-current transitions marked (thicker lines denote the transitions
with the highest probabilities). In contrast, Fig. 7.5(b) shows the lepton
families. Extra transitions that convert quarks between generations are
required; these have no counterpart in the interactions of the leptons. If this
were not the case, the K and Λ particles would be completely stable
(as would some particles with b quarks). Cabibbo developed his theory
when only the u, d, and s quarks were known. The theory states:


The charged-current interactions of quarks proceed with the same
coupling constant $g_W$ as for leptonic interactions provided we associate
a factor $\cos\theta_C$ to $u \leftrightarrow d$ transitions and a factor $\sin\theta_C$ to
$u \leftrightarrow s$ transitions.


Fig. 7.5 Allowed charged-current interactions of quarks (a)
and leptons (b).

With this scheme, a large number of weak interactions and decays are
correctly predicted; the value of $\theta_C$ required is 13.1°.

However, a remaining problem was the decay $K^0_L \to \mu^+\mu^-$, a second-order
weak interaction requiring two W particles in the lowest-order
Feynman diagram. The predicted rate was many orders of magnitude
faster than experimentally measured. A solution to this paradox was
suggested by Glashow, Iliopoulos, and Maiani (GIM) [78].
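
As a quick numerical illustration of the Cabibbo factors, the sketch below evaluates $\cos\theta_C$ and $\sin\theta_C$ for the quoted angle of 13.1° and the resulting relative suppression of $u \leftrightarrow s$ rates with respect to $u \leftrightarrow d$ rates. Taking the rate ratio as $\tan^2\theta_C$ is a simplifying assumption that ignores phase-space and hadronic effects; the variable names are illustrative.

```python
import math

theta_c = math.radians(13.1)  # Cabibbo angle quoted in the text

cos_c = math.cos(theta_c)     # amplitude factor for u <-> d
sin_c = math.sin(theta_c)     # amplitude factor for u <-> s

# Rates scale with the squared amplitude, so the strangeness-changing
# rate is suppressed by tan^2(theta_C) relative to u <-> d
# (a sketch that neglects phase space and hadronic structure).
rate_ratio = (sin_c / cos_c) ** 2

print(f"cos(theta_C) = {cos_c:.4f}, sin(theta_C) = {sin_c:.4f}")
print(f"suppression of u<->s relative to u<->d: 1/{1.0 / rate_ratio:.1f}")
```

The suppression comes out close to the ‘factor of 20’ by which strange particles were observed to decay more slowly.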


7.3.2 GIM mechanism, flavour-changing neutral 
currents 


The GIM mechanism introduced a fourth quark c into the theory,⁹
with couplings $\cos\theta_C$ for $c \leftrightarrow s$ transitions and $-\sin\theta_C$ for $c \leftrightarrow d$. This
produces a second Feynman diagram in the process $K^0_L \to \mu^+\mu^-$, which
interferes destructively with the one involving only u, d, and s quarks
and suppresses the decay to a level compatible with experiment (see
[80, p. 327] or [84, p. 282]), provided the mass difference $m_c - m_u$ is
not too large.

9 This was in 1970, four years before the $J/\psi$ was discovered.

Fig. 7.6 Allowed charged-current interactions with Cabibbo-rotated
quarks (a) and leptons (b).

10 Flavour-changing neutral currents are allowed at higher order and lead
to the very important flavour oscillations such as $K^0 \leftrightarrow \bar{K}^0$, which we will
examine in Chapter 10.

With four quarks, the GIM mechanism gives a nice conceptual picture
of what is going on. If we define a linear superposition of quarks (of flavour
eigenstates d and s) to form d′ and s′,

$$\begin{pmatrix} d' \\ s' \end{pmatrix} = \begin{pmatrix} \cos\theta_C & \sin\theta_C \\ -\sin\theta_C & \cos\theta_C \end{pmatrix}\begin{pmatrix} d \\ s \end{pmatrix} \tag{7.16}$$

then it is possible to view weak charged-current interactions of quarks as
occurring within generations with the same coupling $g_W$ as for leptons,
provided we consider the rotated states d′ and s′ rather than d and s.
This is shown in Fig. 7.6(a) (ignore the t and b quarks in the diagrams
for the moment).

The original motivation for the GIM mechanism was to suppress
weak interactions that change flavour but not the quark charge (flavour-changing
neutral currents), such as $s \leftrightarrow d$, which are only observed at
very low rates. The GIM mechanism will show that these interactions are
forbidden at tree level and it also ensures that the rates are suppressed
at higher order in perturbation theory.

The idea is as follows. The Cabibbo rotation in eqn 7.16 shows that
we have to consider weak charged-current interaction as being between
quarks $u \leftrightarrow d'$ and $c \leftrightarrow s'$ and that these form two families that do not
mix at all, just like the leptons (ignoring neutrino oscillations). Therefore,
for the weak neutral-current interaction, we can hypothesize that
the same thing happens, i.e. that the quark families do not mix at all
in the rotated basis; the only interactions allowed are $u \leftrightarrow u$, $d' \leftrightarrow d'$,
$s' \leftrightarrow s'$ and $c \leftrightarrow c$.

The question is: restricting ourselves to the above hypothesis, are
flavour-changing interactions of the neutral current, i.e. $d \leftrightarrow s$, allowed?
If so, then the overlap between d and s should not be zero. Take the
rotation defined in eqn 7.16, invert it, and use it to give

$$\begin{aligned}
\langle ds \rangle &= (d'\cos\theta_C - s'\sin\theta_C)(d'\sin\theta_C + s'\cos\theta_C) \\
&= (d'd' - s's')\cos\theta_C\sin\theta_C + d's'\cos^2\theta_C - s'd'\sin^2\theta_C \\
&= 0
\end{aligned} \tag{7.17}$$

where the last line comes from the hypothesis that in the new basis,
there are no interactions between quark families, i.e. $d'd' = s's' = 1$ and
$d's' = s'd' = 0$. So lowest-order flavour-changing neutral currents are
not possible under the GIM mechanism. This then agrees with experiment,
since flavour-changing neutral currents are observed to be highly
suppressed.¹⁰
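
The cancellation in eqn 7.17 is a direct consequence of the Cabibbo rotation being orthogonal, and it can be checked with a few lines of matrix arithmetic. In the sketch below the neutral current is represented by a 2 × 2 matrix of couplings in the (d, s) basis; this matrix representation and the variable names are illustrative assumptions, and the ‘no charm’ case shows what would happen if only the d′ state took part.

```python
import numpy as np

theta_c = np.radians(13.1)
c, s = np.cos(theta_c), np.sin(theta_c)

# Cabibbo rotation of eqn 7.16: rows give d' and s' in terms of d and s
R = np.array([[c, s],
              [-s, c]])

# Neutral current diagonal in the rotated basis (d'd' + s's'),
# re-expressed in the (d, s) basis
nc_with_charm = R.T @ np.eye(2) @ R

# Hypothetical three-quark world: only d' couples (no s' partner of charm)
nc_no_charm = R.T @ np.diag([1.0, 0.0]) @ R

print("d<->s coupling with charm   :", round(float(nc_with_charm[0, 1]), 12))
print("d<->s coupling without charm:", round(float(nc_no_charm[0, 1]), 4))
```

With both rotated states included, the off-diagonal d ↔ s entry vanishes identically; with d′ alone it equals $\cos\theta_C\sin\theta_C$, which is the unsuppressed flavour-changing neutral current that the fourth quark was introduced to remove.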


7.3.3 CKM matrix 


As hinted by including t and b quarks in Figs. 7.5 and 7.6, the scheme
can be extended to three quark generations with a 3 × 3 matrix, which
was proposed by Kobayashi and Maskawa (the resulting mixing matrix
V is known as the CKM matrix, where C = Cabibbo):

$$\begin{pmatrix} d' \\ s' \\ b' \end{pmatrix} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}\begin{pmatrix} d \\ s \\ b \end{pmatrix} = V\begin{pmatrix} d \\ s \\ b \end{pmatrix} \tag{7.18}$$

The matrix has values that come from experiment. The magnitudes (we
will consider the phases later) are as follows:¹¹

$$|V| = \begin{pmatrix}
0.97427 \pm 0.00015 & 0.22534 \pm 0.00065 & 0.00351^{+0.00015}_{-0.00014} \\
0.22520 \pm 0.00065 & 0.97344 \pm 0.00016 & 0.0412^{+0.0011}_{-0.0005} \\
0.00867^{+0.00029}_{-0.00031} & 0.0404^{+0.0011}_{-0.0005} & 0.999146^{+0.000021}_{-0.000046}
\end{pmatrix} \tag{7.19}$$


We will review some of the techniques used to measure the elements 
of the CKM matrix in Chapter 8. Note that the bottom row and right 
column are rather close to being all zero or one, i.e. the third generation 
does not mix much with the other two. This is the reason why the decays 
of particles involving b quarks are very slow. 

Kobayashi and Maskawa were originally searching for an excuse to
allow a non-trivial complex number into the 2 × 2 mixing matrix;
they realized¹² that this is a way of inserting (at that time recently
discovered) CP violation into the theory. Because it is possible to add
a phase to each quark without altering the theory (measurable quantities
are proportional to $|M|^2$), this could not be done with a 2 × 2
matrix. If the number of quark generations is extended to 3, the matrix
can have one non-trivial CP violation-generating phase. Much of
what is known as ‘heavy-flavour physics’ revolves around pinning down 
the values of the CKM matrix elements, since they are key param- 
eters in calculating the decay rates of heavy mesons and baryons. In 
addition to measurements depending directly on the matrix elements, 
the CKM matrix must also respect the constraint of unitarity. Al- 
though the number of parameters to be determined is not huge, to take 
proper account of the mathematical constraints and to include system- 
atic and statistical errors correctly is non-trivial and beyond the scope 
of this text. 
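As a rough illustration of the unitarity constraint, the sketch below sums the squared magnitudes along each row and column of the matrix. The numbers are taken to be the central values of eqn 7.19 (treat the digits as illustrative and consult the PDG tables for current values), and the uncertainties are ignored, so this is only a consistency check and not a fit.

```python
# Central values of the CKM magnitudes (assumed to be those of eqn 7.19).
V_ABS = [
    [0.97427, 0.22534, 0.00351],
    [0.22520, 0.97344, 0.0412],
    [0.00867, 0.0404, 0.999146],
]

# For a unitary matrix every row and every column has unit norm.
row_sums = [sum(x * x for x in row) for row in V_ABS]
col_sums = [sum(V_ABS[i][j] ** 2 for i in range(3)) for j in range(3)]

largest_deviation = max(abs(s - 1.0) for s in row_sums + col_sums)
```

A full test of unitarity also involves the phases (the off-diagonal conditions), which cannot be checked from the magnitudes alone.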

It is conventional to treat the charge $-\tfrac13$ quarks (d, s, b) as the ones
that mix to (d′, s′, b′), while the others (u, c, t) do not change. This could
have been done the other way round (or even by considering a combination
of rotations of both the charge $-\tfrac13$ and charge $+\tfrac23$ quarks), but it can
be shown to simplify to the same combinations as used here.

Assuming that the CKM matrix is unitary, it can be parameterized
in terms of three independent mixing angles $\theta_{12}$, $\theta_{23}$, $\theta_{13}$ and the one
complex phase $\delta$ as discussed above. A popular way to parameterize the
matrix, following the Particle Data Group (PDG), is as follows:

$$
V = \begin{pmatrix}
c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\
-s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\
s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13}
\end{pmatrix}
\tag{7.20}
$$

Here $c_{ij} = \cos\theta_{ij}$ and $s_{ij} = \sin\theta_{ij}$, where $i, j = 1, 2, 3$ are generation
labels. In the limit $\theta_{23} = \theta_{13} = 0$, the third generation decouples from
the first two and $\theta_{12} = \theta_c$ (the Cabibbo angle). The phase $\delta$ allows
for CP violation in the Standard Model. CP violation will be covered
further in Chapter 10.

11 For the most recent values consult the PDG Tables [115].

12 For which they won the 2008 Nobel Prize.

Fig. 7.7 Spectator quark diagram for $K^0 \to \pi^+\pi^-$ decay.

13 Which will subsequently decay to mesons with an s quark.
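The parameterization of eqn 7.20 can be explored with a short sketch. The angle and phase values below are illustrative assumptions (roughly the size of the measured ones), not fitted results, and the function names are hypothetical. The code builds the matrix, checks that $VV^\dagger$ is the identity, and confirms that setting $\theta_{23} = \theta_{13} = 0$ leaves a 2 × 2 Cabibbo rotation.

```python
import cmath
import math

def ckm(theta12, theta23, theta13, delta):
    """CKM matrix in the PDG parameterization of eqn 7.20."""
    c12, s12 = math.cos(theta12), math.sin(theta12)
    c23, s23 = math.cos(theta23), math.sin(theta23)
    c13, s13 = math.cos(theta13), math.sin(theta13)
    e = cmath.exp(1j * delta)  # e^{+i delta}
    return [
        [c12 * c13, s12 * c13, s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]

def unitarity_error(v):
    """Largest element of |V V^dagger - 1|."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(v[i][k] * v[j][k].conjugate() for k in range(3))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst

# Illustrative values only (assumptions, not taken from the text).
V = ckm(math.radians(13.04), math.radians(2.38), math.radians(0.201), 1.20)
full_error = unitarity_error(V)

# Decoupling limit: the third generation separates and theta12 is the Cabibbo angle.
theta_c = math.radians(13.1)
V2 = ckm(theta_c, 0.0, 0.0, 0.0)
```

Unitarity holds for any choice of the four parameters, which is why three angles and one phase are enough to describe the matrix.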


7.3.4 Decays of hadrons containing heavy quarks 


We are now in a position to try to make predictions of how weak decays
of mesons and baryons containing heavy quarks might proceed. A useful
concept is that of the ‘spectator quark’. Normally during the existence
of a meson, gluon exchange occurs continuously between the $q\bar{q}'$ pair,
so the strong force is important in understanding what is happening.
The idea is that when the hadron decays, one of its constituent quarks
changes flavour, emitting a virtual W particle; while this is happening,
the other quark(s) are ‘spectators’, i.e. they play no role in the weak
interaction itself. Although this is clearly not a rigorous result, it does
provide a useful approximate model. An example is shown in Fig. 7.7,
where the d is a spectator quark during the decay $K^0 \to \pi^+\pi^-$.

From inspecting the CKM matrix in eqn 7.19, we can see that since
$V_{cs}$ is large, a c quark within a meson is going to decay preferentially
to an s quark. For example, the decay $D^+ \to K$ + anything should
have a high branching ratio, and indeed such decays dominate: $K^-$ +
anything 28%; $\bar{K}^0/K^0$ + anything 61%; $K^+$ + anything 5.5%. Another
indication that the spectator model is correct would be if the lifetime of
a charged D meson were the same as the lifetime of a neutral D. This
is not quite the case: $\tau(D^+) = 1.040$ ps and $\tau(D^0) = 0.410$ ps. However,
the spectator-quark model ignores some other effects: the $D^0$ has more
annihilation diagrams than the $D^+$, and non-perturbative effects from
the strong interaction are still significant.

The B mesons are expected to decay mostly to particles involving
charm,¹³ because $V_{cb}$ is about a factor of 10 bigger than $V_{ub}$, and this
is indeed the case. Since neither $V_{cb}$ nor $V_{ub}$ is very big, the b quark
decays relatively slowly. The measured lifetimes are $\tau(B^+) = 1.67$ ps
and $\tau(B^0) = 1.54$ ps, so the spectator-quark model prediction is much
better. The larger mass of B mesons means that $\alpha_s(m_B)$ is smaller and
non-perturbative effects are less important (this is a consequence of
‘running’ coupling constants; see Chapter 9).

Looking again at the CKM matrix, we see that $V_{tb}$ is nearly 1. This
means that the t quark decays nearly always to the b quark and not
directly to s or d quarks. Since it is very heavy, the phase space is very
large and it will decay rapidly, indeed too rapidly for any meson to be
formed.

A brief reminder about the naming of heavy mesons: the letter B or D
with no subscript means the meson contains a heavy quark and either a
u or a d in order to make up the charge of the meson. More exotic mesons
are denoted with a subscript that indicates the less massive quark, e.g.
$B_s$, $D_s$, or even $B_c$.

7.4 Introduction to electroweak unification


In Sections 7.2 and 7.3, we have characterized the charged-current weak 
interaction of both quarks and leptons as follows: 


• The interaction proceeds with a coupling constant $g_W$ for all weak
charged-current processes, with $G_F/\sqrt{2} = g_W^2/(8M_W^2)$.

• The quark flavour eigenstates are rotated through the Cabibbo
angle $\theta_c \approx 13.1^\circ$ (or for six quarks by the 3 × 3 CKM matrix).

• The ‘vertex factor’ is found to be (e.g. for u → d) $g_W V_{ud} J_\mu$, where
the weak charged current $J_\mu = \bar{u}(u)\gamma_\mu \tfrac12(1 - \gamma_5)u(d)$ and $V_{ud}$ is
the appropriate element of the CKM matrix.
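To get a feel for the size of the weak coupling, the relation $G_F/\sqrt{2} = g_W^2/(8M_W^2)$ in the first bullet can be inverted for $g_W$. The sketch below is a minimal illustration; the input values of $G_F$ and $M_W$ are assumptions (approximate measured values) and are not quoted in this section.

```python
import math

G_F = 1.1664e-5   # Fermi constant in GeV^-2 (assumed input)
M_W = 80.4        # W mass in GeV (assumed input)

# Invert G_F / sqrt(2) = g_W^2 / (8 M_W^2) for the dimensionless coupling.
g_w = math.sqrt(8.0 * M_W ** 2 * G_F / math.sqrt(2.0))

# Weak analogue of the fine structure constant, for comparison with alpha ~ 1/137.
alpha_w = g_w ** 2 / (4.0 * math.pi)
```

The resulting $\alpha_W$ is larger than the electromagnetic $\alpha$: the weak interaction is weak at low energy because of the large W mass in the propagator, not because the coupling itself is small.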


By comparing the properties of the electromagnetic (EM) and weak 
forces, it becomes apparent that these have some very similar prop- 
erties. Perhaps it might be possible to unify them, i.e. to provide a 
theory that covers both forces as two aspects of a more complete the- 
ory. This was achieved by Glashow, Salam, and Weinberg (GSW) and 
we will go through their arguments here. The theory was constructed 
before neutral currents had been discovered, and indeed it predicted 
the properties of neutral currents, which were subsequently experi- 
mentally verified. We will start with just the EM and weak charged- 
current interactions and see how the neutral current emerges from the 
theory. 

First, in Table 7.3, we list the properties of the forces more carefully. 
Although the forces have similarities, they are clearly different. The pro- 
cedure used by GSW to unify the forces is divided into four steps, since 
it is complicated to explain. We will use the $\tau^- \to \nu_\tau e^- \bar{\nu}_e$ decay from
Section 7.2.2 as an example. 


7.4.1 Electroweak unification procedure 


The process of electroweak unification starts with the components of
the weak interaction as they were known around 1964 (i.e. only the
postulated $W^\pm$, and no neutral weak interaction), and the success of the
Feynman diagram view of perturbation theory for QED calculations. In
exactly the same way that local gauge invariance in the Dirac equation
leads to an additional spin-1 field (the photon) and the correct
photon–electron interaction (see Section 6.5), we use the same mechanism to
insert the weak charged-current interaction of the fermions mediated by a
spin-1 boson (the $W^\pm$).¹⁴ We assume the V−A structure from Section 7.2.4
and that the spin of the W is 1, as guided by experiment, so the current
is as in Section 7.2.6:

$$
J_\mu = \bar{u}(\nu_\tau)\gamma_\mu \tfrac12(1 - \gamma_5)u(\tau)
\tag{7.21}
$$

14 We ignore for now the fact that the local gauge invariance mechanism
only works with massless particles like photons; we will return to this later.




| EM interaction | Weak CC interaction |
|---|---|
| Maxwell’s equations | e.g. $\nu_e n \to p e^-$, $\mu^- \to e^- \bar{\nu}_e \nu_\mu$, $K^0 \to e^+ \pi^- \nu_e$ |
| Long range | Short range |
| $\gamma$ = massless spin-1 boson | Massive $W^\pm$ spin-1 bosons |
| Force $\dfrac{1}{4\pi\epsilon_0}\dfrac{Q_1 Q_2}{r^2}$ | Propagator $\dfrac{1}{M_W^2 - q^2}$ |
| Acts on particles depending on their charge | Acts on left-handed particles and right-handed antiparticles only |
| Conserves parity | Not parity-conserving |
| Comes from the Dirac equation when insisting on local gauge invariance | |
| $J_\mu^{EM} = Q\,\bar{\psi}\gamma_\mu\psi$, where $Q$ is the electric charge in units in which the electron has $Q = -1$ | $J_\mu = \bar{\psi}\gamma_\mu \tfrac12(1 - \gamma_5)\psi$ |
| Coupling constant $e = \sqrt{4\pi\alpha}$ | Coupling constant $g_W$ for all weak interactions (universality); use CKM matrix elements with quarks |

Table 7.3 Comparison of properties of the electromagnetic and weak charged-current interactions.


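Step 1: Left-handed particles

Since $P_L = \tfrac12(1 - \gamma_5)$ is a projection operator, we can operate on a spinor
with it twice in succession without changing the effect, so
$\tfrac12(1 - \gamma_5)u(\tau) = \tfrac12(1 - \gamma_5)\tfrac12(1 - \gamma_5)u(\tau)$, and

\begin{align}
J_\mu &= \bar{u}(\nu_\tau)\,\gamma_\mu\,\tfrac12(1 - \gamma_5)\,\tfrac12(1 - \gamma_5)\,u(\tau) \tag{7.22} \\
&= \bar{u}(\nu_\tau)\,\tfrac12(1 + \gamma_5)\,\gamma_\mu\,\tfrac12(1 - \gamma_5)\,u(\tau) \tag{7.23} \\
&= \left[\bar{u}(\nu_\tau)\,\tfrac12(1 + \gamma_5)\right]\gamma_\mu\left[\tfrac12(1 - \gamma_5)\,u(\tau)\right] \tag{7.24} \\
&= \bar{u}_L(\nu_\tau)\,\gamma_\mu\,u_L(\tau) \tag{7.25}
\end{align}

where in line 7.23, we have used the identity $\gamma_\mu\gamma_5 + \gamma_5\gamma_\mu = 0$.

Line 7.24 is the main part of step 1 towards unification: we associate
the parts with the $(1 \pm \gamma_5)$ with the spinors rather than the operator. By
doing so, we are left with an operator $\gamma_\mu$ that looks like the EM operator.
This looks more obvious in line 7.25, where we write the left-handed
parts of the spinors directly. Table 7.4 shows the correspondence between
left- and right-handed spinors and adjoint spinors for both particles and
antiparticles.

| Particles | Antiparticles |
|---|---|
| $u_L = \tfrac12(1 - \gamma_5)u$ | $v_L = \tfrac12(1 + \gamma_5)v$ |
| $u_R = \tfrac12(1 + \gamma_5)u$ | $v_R = \tfrac12(1 - \gamma_5)v$ |
| $\bar{u}_L = \bar{u}\,\tfrac12(1 + \gamma_5)$ | $\bar{v}_L = \bar{v}\,\tfrac12(1 - \gamma_5)$ |
| $\bar{u}_R = \bar{u}\,\tfrac12(1 - \gamma_5)$ | $\bar{v}_R = \bar{v}\,\tfrac12(1 + \gamma_5)$ |

R, L correspond to helicity +1, −1 if $m = 0$ and approximately if $m \approx 0$.

Table 7.4 Spinors of particles and antiparticles.

$W^\pm$ vertex factors are the same as EM if we act only on the
left-handed part of the spinor.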
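The two algebraic facts used in step 1, that $\gamma_5$ anticommutes with every $\gamma_\mu$ and that $P_L = \tfrac12(1 - \gamma_5)$ is a projection operator, can be verified with explicit matrices. The sketch below assumes the Dirac representation of the $\gamma$ matrices (the representation used in Chapter 6 may differ, but the identities are representation independent); all helper names are illustrative.

```python
# 2x2 building blocks; the Dirac representation is an assumption here.
I2 = [[1, 0], [0, 1]]
Z2 = [[0, 0], [0, 0]]
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def neg(m):
    return [[-x for x in row] for row in m]

def block(a, b, c, d):
    """Assemble a 4x4 matrix from the 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def combine(a, b, ca=1, cb=1):
    """Return ca*a + cb*b element by element."""
    return [[ca * x + cb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def is_zero(m, tol=1e-12):
    return all(abs(x) < tol for row in m for x in row)

I4 = block(I2, Z2, Z2, I2)
GAMMA = [block(I2, Z2, Z2, neg(I2))] + [block(Z2, s, neg(s), Z2) for s in SIGMA]
GAMMA5 = block(Z2, I2, I2, Z2)

P_L = combine(I4, GAMMA5, 0.5, -0.5)   # (1 - gamma5)/2
P_R = combine(I4, GAMMA5, 0.5, 0.5)    # (1 + gamma5)/2

# gamma_mu gamma5 + gamma5 gamma_mu = 0 for mu = 0..3 (used in line 7.23).
anticommutators_vanish = all(
    is_zero(combine(matmul(g, GAMMA5), matmul(GAMMA5, g))) for g in GAMMA
)
# P_L P_L = P_L (used in line 7.22) and P_L P_R = 0.
p_l_is_projector = is_zero(combine(matmul(P_L, P_L), P_L, 1, -1))
p_l_p_r_orthogonal = is_zero(matmul(P_L, P_R))
```

The same check with $P_L \gamma_\mu$ replaced by $\gamma_\mu P_R$ shows why the projector changes from $(1 - \gamma_5)$ to $(1 + \gamma_5)$ when it is moved to the left of $\gamma_\mu$.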


Interlude: weak isospin 


We now add some structure to formalize step 1 by adding a quantity $I_3$
that plays an analogous role for the weak interaction as the charge $Q$ does
for the EM interaction. We categorize the particles (indeed the separate
left- and right-handed parts¹⁵) according to how they interact weakly.
We define a quantity called weak isospin in a mathematically analogous
way to isospin (and ordinary spin), and arrange the left-handed parts of
the particles in $I = \tfrac12$ doublets such as $(\nu_{\mu L}, \mu_L)$ or $(u_L, d'_L)$ containing
the particles that can change into each other at a weak CC vertex.
The member of each multiplet with the more positive charge is assigned
$I_3 = +\tfrac12$ and the member with the more negative charge is assigned
$I_3 = -\tfrac12$:

| $I$ | $I_3$ | Leptons | Quarks |
|---|---|---|---|
| $\tfrac12$ | $+\tfrac12$ | $\nu_{eL}$, $\nu_{\mu L}$, $\nu_{\tau L}$ | $u_L$, $c_L$, $t_L$ |
| $\tfrac12$ | $-\tfrac12$ | $e_L$, $\mu_L$, $\tau_L$ | $d'_L$, $s'_L$, $b'_L$ |

The right-handed parts of all the particles are assigned to weak isospin
singlets with $I = 0$ and $I_3 = 0$.

| $I$ | $I_3$ | Leptons | Quarks |
|---|---|---|---|
| 0 | 0 | $e_R$, $\mu_R$, $\tau_R$ | $u_R$, $c_R$, $t_R$, $d'_R$, $s'_R$, $b'_R$ |


The right-handed neutrinos have been left out of the table; their nature
is still being investigated. They also have $I_3 = 0$. It is possible
that neutrinos are Majorana particles (see Chapter 11), i.e. they are
their own antiparticles (in which case, flipping the helicity of a neutrino
$\nu_L$ produces the antineutrino $\bar{\nu}_R$; neutrinoless double $\beta$ decay becomes
possible), or they could be Dirac particles in which neutrino and
antineutrino are distinct, and therefore the $\nu_R$ and $\bar{\nu}_L$ do not interact
with any known force. If neutrinos had been exactly massless, these two
situations would be experimentally indistinguishable.


The $I_3$ label, which is $+\tfrac12$, $-\tfrac12$, or 0, is used for the weak
charged-current interaction in a similar way to the charge in an
electromagnetic interaction, i.e. when it is 0, there is no interaction.

15 The left- and right-handed parts are not particles in their own right.
An electron spinor $u(e)$ describes the real electron; we split it up as
$u(e) = u(e_L) + u(e_R)$ simply to make the similarity between the EM and
weak interactions apparent in the formalism.

Step 2: $W^0$


We next appeal to symmetry, and postulate a neutral partner $W^0$ to
the $W^\pm$. This interacts producing no change in charge within each
doublet.


• It has the same coupling constant $g_W$.
• It interacts only with the left-handed states.


This is not the $Z^0$, which also interacts with right-handed states of
quarks and leptons.


Step 3: $B^0$


Instead of the electromagnetic interaction, we introduce another field,
the $B^0$, which will ensure the correct electromagnetic interaction in
the following step. The $B^0$ interacts with a strength proportional to
a quantity called weak hypercharge $Y$, where $Y$ is defined by

$$
Q = I_3 + \tfrac12 Y
\tag{7.26}
$$

The $B^0$ interacts with a current $J^Y_\mu$:

$$
J^Y_\mu = 2J^{EM}_\mu - 2J^0_\mu
\tag{7.27}
$$

We give the $B^0$ interaction its own coupling constant $g'/2$ (the factor of
2 is just a convention). Examples of $B^0$ currents involving left-handed
up quarks and right-handed electrons are

$$
J^Y_\mu = \bar{u}(u_L)\,\gamma_\mu\,Y(u_L)\,u(u_L), \qquad \bar{u}(e_R)\,\gamma_\mu\,Y(e_R)\,u(e_R)
\tag{7.28}
$$

where $Y(u_L) = +\tfrac13$ and $Y(e_R) = -2$ using eqn 7.26. Table 7.6 at the
end of this section gives $Y$ for all the particles.
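The hypercharge assignments follow mechanically from eqn 7.26 once $Q$ and $I_3$ are known. A minimal sketch using exact fractions is given below; the dictionary keys are illustrative labels, and one representative of each fermion type is listed.

```python
from fractions import Fraction as F

def hypercharge(q, i3):
    """Weak hypercharge from eqn 7.26, Q = I3 + Y/2."""
    return 2 * (q - i3)

# (electric charge Q, weak isospin third component I3) for one generation.
FERMIONS = {
    "nu_L": (F(0), F(1, 2)),
    "e_L": (F(-1), F(-1, 2)),
    "e_R": (F(-1), F(0)),
    "u_L": (F(2, 3), F(1, 2)),
    "d_L": (F(-1, 3), F(-1, 2)),
    "u_R": (F(2, 3), F(0)),
    "d_R": (F(-1, 3), F(0)),
}

Y = {name: hypercharge(q, i3) for name, (q, i3) in FERMIONS.items()}
```

Both members of a left-handed doublet come out with the same $Y$, as they must if the $B^0$ is to couple to the doublet as a whole.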


Step 4: The Weinberg angle 


We will use the notation introduced in Section 7.2.6 with currents and
fields. Table 7.5 gives a summary of the symbols used, the same as those
above with the addition of $\mu$ indices on the currents and fields. Since
all the fields are vector or axial vector fields, they include an index
$\mu = 0, 1, 2, 3$. When a current and a field are combined to form a matrix
element, there is an implied summation over $\mu$.

The electromagnetic interaction $A^\mu$ is a linear combination of $W^{0\mu}$
and $B^\mu$, and the orthogonal combination produces a new interaction,
which is the weak neutral current $Z^0$. A convenient way to form the
linear combination is with a rotation angle. We introduce a new rotation
angle $\theta_W$, the ‘weak mixing angle’ or ‘Weinberg angle’.

$$
\begin{aligned}
A^\mu &= +B^\mu\cos\theta_W + W^{0\mu}\sin\theta_W \\
Z^\mu &= -B^\mu\sin\theta_W + W^{0\mu}\cos\theta_W
\end{aligned}
\tag{7.29}
$$

| | Current | Field | Coupling constant | ‘Charge’ |
|---|---|---|---|---|
| Charge-raising weak | $J^+_\mu$ | $W^{+\mu}$ | $g_W$ | $I_3$ |
| Charge-lowering weak | $J^-_\mu$ | $W^{-\mu}$ | $g_W$ | $I_3$ |
| Symmetric $W^0$ field | $J^0_\mu$ | $W^{0\mu}$ | $g_W$ | $I_3$ |
| $B^0$ field (step 3) | $J^Y_\mu$ | $B^\mu$ | $g'/2$ | $Y$ |
| Electromagnetic | $J^{EM}_\mu$ | $A^\mu$ | $e$ | $Q$ |
| Neutral current | $J^Z_\mu$ | $Z^\mu$ | | |

Table 7.5 Summary of fields discussed in this section and the symbols used.


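We now write down the total GSW electroweak interaction in the form
of a (current)$_\mu$(field)$^\mu$ Lorentz scalar:¹⁶

$$
\frac{g_W}{\sqrt{2}}\left(J^+_\mu W^{+\mu} + J^-_\mu W^{-\mu}\right) + g_W J^0_\mu W^{0\mu} + \frac{g'}{2} J^Y_\mu B^\mu
\tag{7.30}
$$

The factors of $\sqrt{2}$ come from group theory. Now, we invert eqn 7.29,

\begin{align}
B^\mu &= +A^\mu\cos\theta_W - Z^\mu\sin\theta_W \tag{7.31} \\
W^{0\mu} &= +A^\mu\sin\theta_W + Z^\mu\cos\theta_W \tag{7.32}
\end{align}

and write the neutral part of eqn 7.30 in terms of $A^\mu$ and $Z^\mu$ by
substituting in from eqns 7.31 and 7.32:

$$
\begin{aligned}
g_W J^0_\mu W^{0\mu} + \frac{g'}{2} J^Y_\mu B^\mu
&= \left(g_W\sin\theta_W\,J^0_\mu + g'\cos\theta_W\,\frac{J^Y_\mu}{2}\right)A^\mu \\
&\quad + \left(g_W\cos\theta_W\,J^0_\mu - g'\sin\theta_W\,\frac{J^Y_\mu}{2}\right)Z^\mu
\end{aligned}
\tag{7.33}
$$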

Recovering the EM interaction 


Consider next the two parts of the right-hand side of eqn 7.33 separately.
The first part must be set equal to $e J^{EM}_\mu A^\mu$, otherwise the GSW theory
will not reproduce the EM physics described by QED. From eqn 7.27,
$J^{EM}_\mu = J^0_\mu + \tfrac12 J^Y_\mu$, and so

$$
\left(g_W\sin\theta_W\,J^0_\mu + g'\cos\theta_W\,\frac{J^Y_\mu}{2}\right)A^\mu = e\left(J^0_\mu + \frac{J^Y_\mu}{2}\right)A^\mu
\tag{7.34}
$$

For this to be satisfied, we must set

$$
e = g_W\sin\theta_W = g'\cos\theta_W
\tag{7.35}
$$

These two equalities are called the ‘unification condition’.

16 A more detailed treatment constructs the $W^+$ and $W^-$ in a way that
makes the symmetry with $W^0$ more apparent; see Further Reading.
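The unification condition of eqn 7.35 fixes both $g_W$ and $g'$ once $e$ and the weak mixing angle are known. The sketch below is illustrative: it takes $e = \sqrt{4\pi\alpha}$ with the low-energy value of $\alpha$ and assumes $\sin^2\theta_W = 0.231$, neither of which is specified at this point in the text.

```python
import math

ALPHA = 1.0 / 137.036    # fine structure constant at low energy (assumed input)
SIN2_THETA_W = 0.231     # weak mixing angle (assumed input)

e = math.sqrt(4.0 * math.pi * ALPHA)
sin_w = math.sqrt(SIN2_THETA_W)
cos_w = math.sqrt(1.0 - SIN2_THETA_W)

# Unification condition, eqn 7.35: e = g_W sin(theta_W) = g' cos(theta_W).
g_w = e / sin_w
g_prime = e / cos_w

# Equivalent statement: 1/e^2 = 1/g_W^2 + 1/g'^2.
inverse_sum = 1.0 / g_w ** 2 + 1.0 / g_prime ** 2
```

Both couplings come out of order one, and $g_W > e$, consistent with the statement that the weak and electromagnetic interactions have comparable intrinsic strengths.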


The neutral-current interaction 


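The second part of eqn 7.33 is the weak neutral current $g_Z J^Z_\mu$. This is
completely specified (i.e. there are no free parameters) by the charged
weak current and electromagnetism:

$$
g_Z J^Z_\mu = g_W\cos\theta_W\,J^0_\mu - g'\sin\theta_W\,\frac{J^Y_\mu}{2}
\tag{7.36}
$$

We now manipulate this expression to remove the parts involving $B^0$.
Using eqn 7.27 and the unification condition 7.35, we have

$$
\begin{aligned}
g_Z J^Z_\mu &= \frac{g_W}{\cos\theta_W}\left[\cos^2\theta_W\,J^0_\mu - \sin^2\theta_W\left(J^{EM}_\mu - J^0_\mu\right)\right] \\
&= \frac{g_W}{\cos\theta_W}\left(J^0_\mu - \sin^2\theta_W\,J^{EM}_\mu\right)
\end{aligned}
\tag{7.37}
$$

We can now put explicit forms of $J^0_\mu = I_3\,\bar{u}\gamma_\mu\tfrac12(1 - \gamma_5)u$ and
$J^{EM}_\mu = Q\,\bar{u}\gamma_\mu u$ into this expression to give

$$
\begin{aligned}
g_Z J^Z_\mu &= \frac{g_W}{\cos\theta_W}\,\bar{u}\gamma_\mu\left[\tfrac12(1 - \gamma_5)I_3 - \sin^2\theta_W\,Q\right]u \\
&= \frac{g_W}{\cos\theta_W}\,\bar{u}\gamma_\mu\left[\tfrac12(1 - \gamma_5)I_3 - \sin^2\theta_W\,Q\left(\tfrac12(1 - \gamma_5) + \tfrac12(1 + \gamma_5)\right)\right]u \\
&= \frac{g_W}{\cos\theta_W}\,\bar{u}\gamma_\mu\left[g_L\,\tfrac12(1 - \gamma_5) + g_R\,\tfrac12(1 + \gamma_5)\right]u
\end{aligned}
\tag{7.38}
$$

where $g_L = I_3 - Q\sin^2\theta_W$ and $g_R = -Q\sin^2\theta_W$ are the couplings of
the left- and right-handed particles, respectively. Note that, apart from
neutrinos for which $Q = 0$ and so $g_R = 0$, the neutral current interacts
with both the left- and right-handed states of the particle but with
different strengths. In contrast, the charged-current interaction involves
only the left-handed states of all particles.

It is also possible to rearrange eqn 7.38 to define coupling constants
$c_V$ and $c_A$ in terms of $g_L$ and $g_R$:

\begin{align}
g_Z J^Z_\mu &= \frac{g_W}{\cos\theta_W}\,\bar{u}\gamma_\mu\left[\tfrac12(g_L + g_R) - \tfrac12(g_L - g_R)\gamma_5\right]u \tag{7.39} \\
&= \frac{g_W}{\cos\theta_W}\,\bar{u}\gamma_\mu\,\tfrac12\left(c_V - c_A\gamma_5\right)u \tag{7.40}
\end{align}

| Fermion ($f$) | $I$ | $I_3$ | $Q$ | $Y$ | $c_V^f$ | $c_A^f$ |
|---|---|---|---|---|---|---|
| $\nu_{eL}$, $\nu_{\mu L}$, $\nu_{\tau L}$ | $\tfrac12$ | $+\tfrac12$ | 0 | −1 | $+\tfrac12$ | $+\tfrac12$ |
| $e_L$, $\mu_L$, $\tau_L$ | $\tfrac12$ | $-\tfrac12$ | −1 | −1 | $-\tfrac12 + 2\sin^2\theta_W$ | $-\tfrac12$ |
| $\nu_{eR}$, $\nu_{\mu R}$, $\nu_{\tau R}$ | 0 | 0 | 0 | 0 | | |
| $e_R$, $\mu_R$, $\tau_R$ | 0 | 0 | −1 | −2 | | |
| $u_L$, $c_L$, $t_L$ | $\tfrac12$ | $+\tfrac12$ | $+\tfrac23$ | $+\tfrac13$ | $+\tfrac12 - \tfrac43\sin^2\theta_W$ | $+\tfrac12$ |
| $d'_L$, $s'_L$, $b'_L$ | $\tfrac12$ | $-\tfrac12$ | $-\tfrac13$ | $+\tfrac13$ | $-\tfrac12 + \tfrac23\sin^2\theta_W$ | $-\tfrac12$ |
| $u_R$, $c_R$, $t_R$ | 0 | 0 | $+\tfrac23$ | $+\tfrac43$ | | |
| $d_R$, $s_R$, $b_R$ | 0 | 0 | $-\tfrac13$ | $-\tfrac23$ | | |

Table 7.6 The electroweak properties associated with each group of fermions.

where $c_V = I_3 - 2Q\sin^2\theta_W$ and $c_A = I_3$. Different applications prefer to
use either $g_L$, $g_R$ or $c_V$, $c_A$.¹⁷ The values of $c_V^f$ and $c_A^f$ for each fermion
$f$ along with other properties of the fermions are shown in Table 7.6.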
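The neutral-current couplings can be tabulated directly from the definitions $g_L = I_3 - Q\sin^2\theta_W$, $g_R = -Q\sin^2\theta_W$, $c_V = g_L + g_R$ and $c_A = g_L - g_R$. The sketch below assumes $\sin^2\theta_W = 0.231$ for illustration and uses the $I_3$ of the left-handed state throughout, as in footnote 17; the function and key names are hypothetical.

```python
SIN2_THETA_W = 0.231  # assumed value of the weak mixing angle

def couplings(i3, q, s2w=SIN2_THETA_W):
    """Neutral-current couplings for a fermion with left-handed I3 and charge Q."""
    g_l = i3 - q * s2w
    g_r = -q * s2w
    return {"g_L": g_l, "g_R": g_r, "c_V": g_l + g_r, "c_A": g_l - g_r}

# (I3 of the left-handed state, electric charge Q)
FERMIONS = {
    "neutrino": (0.5, 0.0),
    "electron": (-0.5, -1.0),
    "up": (0.5, 2.0 / 3.0),
    "down": (-0.5, -1.0 / 3.0),
}

TABLE = {name: couplings(i3, q) for name, (i3, q) in FERMIONS.items()}
```

With this value of the mixing angle the electron's $c_V$ is close to zero, so the Z coupling to charged leptons is almost purely axial.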


7.4.2 Weak neutral currents 


The existence of weak neutral currents (see Section 8.3) was the first
critical prediction of the unified electroweak theory. The amplitude for
$\bar{\nu}_\mu e^- \to \bar{\nu}_\mu e^-$ is

$$
M = \frac{g_W^2}{8M_Z^2\cos^2\theta_W}
\left[\bar{u}(\bar{\nu}_\mu)\,\gamma^\mu\left(c_V^{(\nu)} - c_A^{(\nu)}\gamma_5\right)u(\bar{\nu}_\mu)\right]
\left[\bar{u}(e)\,\gamma_\mu\left(c_V^{(e)} - c_A^{(e)}\gamma_5\right)u(e)\right]
\tag{7.41}
$$

We can look up the couplings from Table 7.6, which are $c_V^{(\nu)} = c_A^{(\nu)} = \tfrac12$,
$c_V^{(e)} = -\tfrac12 + 2\sin^2\theta_W$, and $c_A^{(e)} = -\tfrac12$, and so

$$
M = \frac{g_W^2}{8M_Z^2\cos^2\theta_W}
\left[\bar{u}(\bar{\nu}_\mu)\,\gamma^\mu\,\tfrac12(1 - \gamma_5)\,u(\bar{\nu}_\mu)\right]
\left[\bar{u}(e)\,\gamma_\mu\left(2\sin^2\theta_W - \tfrac12(1 - \gamma_5)\right)u(e)\right]
\tag{7.42}
$$

17 For simplicity, we can use the value of $I_3$ for the left-handed particle in
these equations all the time; if we have a right-handed component to the
particle, we can write it as $\tfrac12(1 + \gamma_5)u$, and when combined with the
$(1 - \gamma_5)$ in eqn 7.38, it causes the term with $I_3$ in it to be zero.

Fig. 7.8 Radiative corrections to the W and Z masses from top-quark loops.

Fig. 7.9 Radiative corrections to the W mass from Higgs loops. Equivalent
diagrams also apply to the Z.


Note that the current involving the neutrino takes the same $\tfrac12(1 - \gamma_5)$
form as the charged current. What is special about the neutrino that
causes this? It is the fact that it is electrically neutral, so all of its
interaction comes from the $W^0$, with none from the $B^0$.

The calculation of the cross section involves averaging over initial spins
and summing over final spins. It can be done with several time-saving
tricks (see Further Reading). The result for $d\sigma(\bar{\nu}_\mu e \to \bar{\nu}_\mu e)/dE_e$ is

$$
\frac{d\sigma}{dE_e} = \frac{G_F^2 m_e}{2\pi}\left[\left(c_V^e - c_A^e\right)^2 + \left(c_V^e + c_A^e\right)^2\left(1 - \frac{E_e}{E_\nu}\right)^2\right]
\tag{7.43}
$$

This can be integrated to give the cross section

$$
\sigma = \int \frac{d\sigma}{dE_e}\,dE_e \approx E_\nu \times 10^{-46}\ \mathrm{m}^2
$$

(where $E_\nu$ is in GeV), a very small cross section! The neutral current is
also observable in neutrino–nucleon collisions. The cross sections are still
small, but somewhat larger than for scattering off electrons. The observation
and precision measurements of neutral currents will be discussed
in Chapter 8.
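The order of magnitude of the integrated cross section can be reproduced numerically. The sketch below assumes the differential form $d\sigma/dE_e = (G_F^2 m_e/2\pi)[(c_V - c_A)^2 + (c_V + c_A)^2(1 - E_e/E_\nu)^2]$ for antineutrino scattering, integrates $E_e$ from 0 to $E_\nu$ (valid for $E_\nu \gg m_e$), and takes $\sin^2\theta_W = 0.231$; the numerical constants are assumed inputs, not values quoted in the text.

```python
import math

G_F = 1.1664e-5          # Fermi constant, GeV^-2 (assumed input)
M_E = 0.511e-3           # electron mass, GeV
GEV2_TO_M2 = 3.894e-32   # (hbar c)^2: converts GeV^-2 to m^2
SIN2_THETA_W = 0.231     # assumed weak mixing angle

C_V = -0.5 + 2.0 * SIN2_THETA_W   # electron vector coupling
C_A = -0.5                        # electron axial coupling

def dsigma_dEe(e_e, e_nu):
    """Differential cross section in GeV^-3 (assumed form, see lead-in)."""
    y = e_e / e_nu
    return G_F ** 2 * M_E / (2.0 * math.pi) * (
        (C_V - C_A) ** 2 + (C_V + C_A) ** 2 * (1.0 - y) ** 2
    )

def sigma(e_nu, steps=1000):
    """Midpoint-rule integral over 0 < E_e < E_nu, returned in m^2."""
    h = e_nu / steps
    total = sum(dsigma_dEe((i + 0.5) * h, e_nu) for i in range(steps)) * h
    return total * GEV2_TO_M2

sigma_1gev = sigma(1.0)

# Closed form of the same integral, for comparison.
sigma_exact = (G_F ** 2 * M_E / (2.0 * math.pi)
               * ((C_V - C_A) ** 2 + (C_V + C_A) ** 2 / 3.0) * GEV2_TO_M2)
```

Because the integrand depends on $E_e$ only through $E_e/E_\nu$, the cross section rises linearly with the neutrino energy, which is why it is quoted per GeV.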


7.4.3 Masses of W and Z bosons 


We can now use the electroweak unification theory to predict the masses
of the W and Z bosons in terms of the weak mixing angle $\sin\theta_W$ and
the Fermi coupling constant $G_F$. We start with the relation between $G_F$
and $M_W$, eqn 7.5, and, substituting for $g_W$ from eqn 7.35, we find

$$
\frac{G_F}{\sqrt{2}} = \frac{e^2}{8M_W^2\sin^2\theta_W},
\qquad
M_W = \left(\frac{\sqrt{2}\,e^2}{8G_F\sin^2\theta_W}\right)^{1/2}
\tag{7.44}
$$

Using the unification condition 7.35 again, we can simply relate the
masses of the W and Z bosons in terms of the weak mixing angle:

$$
\frac{M_W}{M_Z} = \cos\theta_W
\tag{7.45}
$$

Therefore, if we have measurements of $\sin^2\theta_W$ and $G_F$ from low-energy
experiments, we can predict the masses of the W and the Z bosons.
Hence the discovery of the W and the Z bosons (see Chapter 8) at the
expected masses was a triumph for the electroweak theory.

The above discussion of the W and Z bosons is valid at lowest order
in perturbation theory. There are, however, small but important radiative
corrections $\Delta r$ from higher-order diagrams. The results of these
corrections are parameterized by modifying eqn 7.44 to

$$
M_W^2 = \frac{\sqrt{2}\,e^2}{8G_F\sin^2\theta_W\,(1 - \Delta r)}
\tag{7.46}
$$

There are contributions to $\Delta r$ from fermion loops containing t quarks (in
principle, other quarks contribute, but the t quark is dominant because
of its much larger mass) as shown in Fig. 7.8 on page 200.
These give a contribution

$$
(\Delta r)_{\mathrm{top}} = -\frac{3G_F m_t^2}{8\sqrt{2}\,\pi^2}\,\frac{c_W^2}{s_W^2}
\tag{7.47}
$$

where we define $s_W = \sin\theta_W$ and $c_W^2 = 1 - s_W^2$. The masses of the W
and Z are also affected by Higgs loops (see Fig. 7.9 on page 200). The
contribution to $\Delta r$ is given by

$$
(\Delta r)_{\mathrm{Higgs}} = \frac{11G_F M_Z^2 c_W^2}{24\sqrt{2}\,\pi^2}\,\ln\frac{m_H^2}{M_Z^2}
\tag{7.48}
$$

$\Delta r$ (eqn 7.46) is the sum of the virtual top and Higgs loop corrections
to $M_W$ and $M_Z$ as well as the running of the fine structure constant ($\alpha$)
from low energy to the value at $M_Z$. The effect of the running of $\alpha$ is
given by

$$
\delta r_0 = 1 - \alpha/\alpha(M_Z)
$$

where $\alpha$ is the value of the fine structure constant at low energy and
$\alpha(M_Z)$ is the value at the scale $Q^2 = M_Z^2$. The overall sum of these
radiative corrections is given by

$$
\Delta r = \delta r_0 + (\Delta r)_{\mathrm{top}} + (\Delta r)_{\mathrm{Higgs}}
$$

Therefore, precision measurements of $M_W$, $M_Z$, and $G_F$ make predictions
for the allowed values of the masses of the top quark and the Higgs
boson. Note that the contribution from the Higgs depends logarithmically
on $m_H$, whereas the contribution from the top quark scales as $m_t^2$.
Therefore, even with no knowledge of the Higgs mass other than that
imposed by unitarity, precision electroweak measurements including $M_W$
and $M_Z$ predicted the mass of the top quark to be around 170 GeV (see
Chapter 8).
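The numbers in this subsection can be put together in a short sketch. It first evaluates the lowest-order predictions of eqns 7.44 and 7.45, then the three contributions to $\Delta r$. All inputs are assumed, approximate values ($\sin^2\theta_W = 0.231$, $\alpha(M_Z) \approx 1/128$, $m_t = 173$ GeV, $m_H = 125$ GeV), and the top contribution is taken with a negative sign, $-(3G_F m_t^2/8\sqrt{2}\pi^2)(c_W^2/s_W^2)$, which is the form assumed here.

```python
import math

ALPHA = 1.0 / 137.036      # low-energy fine structure constant
ALPHA_MZ = 1.0 / 128.0     # assumed value at the Z mass scale
G_F = 1.1664e-5            # GeV^-2 (assumed input)
S2 = 0.231                 # sin^2(theta_W), assumed
C2 = 1.0 - S2
M_T, M_H, M_Z_MEAS = 173.0, 125.0, 91.19   # GeV, assumed inputs

# Lowest-order masses, eqns 7.44 and 7.45, with e^2 = 4 pi alpha.
e2 = 4.0 * math.pi * ALPHA
m_w_tree = math.sqrt(math.sqrt(2.0) * e2 / (8.0 * G_F * S2))
m_z_tree = m_w_tree / math.sqrt(C2)

# Radiative corrections (forms as assumed in the lead-in).
dr_0 = 1.0 - ALPHA / ALPHA_MZ
dr_top = -3.0 * G_F * M_T ** 2 * C2 / (8.0 * math.sqrt(2.0) * math.pi ** 2 * S2)
dr_higgs = (11.0 * G_F * M_Z_MEAS ** 2 * C2
            / (24.0 * math.sqrt(2.0) * math.pi ** 2)
            * math.log(M_H ** 2 / M_Z_MEAS ** 2))
delta_r = dr_0 + dr_top + dr_higgs
```

The lowest-order masses come out a few GeV below the measured values, and the running of $\alpha$ and the top loop are each much larger than the Higgs term, which illustrates why the top mass could be constrained long before the Higgs mass.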


7.4.4 The standard model, how good is it? 
We now look at a list of some of the difficulties with the weak interaction: 


(1) The original Fermi 4-point theory had a problem: the cross section
$\sigma \propto G_F^2 E^2$, where $E$ is the centre-of-mass energy. This is fine at low
energy, but at 300 GeV the scattering probability becomes bigger
than 1. This is known as unitarity violation and is bad news for
a theory. By introducing the W boson and moving away from a
four-point interaction (Fig. 7.10), a propagator term $1/(M_W^2 - q^2)$
is introduced and the cross section stops increasing.

(2) Another similar problem occurs even with W bosons in the theory.
The process $\nu\bar{\nu} \to W^+W^-$ in the theory before electroweak
unification (Fig. 7.11(a)) is divergent (i.e. becomes very large
at high energy). Electroweak theory predicts neutral currents,
i.e. diagrams including the Z, so we now have another possible
diagram (Fig. 7.11(b)) involving $Z \to WW$, which cancels the
divergence.

(3) Also, in $e^+e^- \to W^+W^-$, the two charged-current diagrams shown
in Fig. 7.12 need cancellation from the third diagram, which comes
from the combined electroweak theory.

(4) Renormalization. We have not gone into full detail about calculating
Feynman diagrams. Higher-order diagrams contain internal
loops. The Feynman rules require that we integrate over the particle
momentum in each loop. The diagram in Fig. 7.13 presents a
problem in that the integral (which we want to evaluate with an
upper limit of infinity) is

$$
\int^{\infty}\frac{1}{q}\,dq = \big[\ln q\big]^{\infty}
\tag{7.49}
$$

which is divergent. The solution, renormalization theory, took a
long time to develop and to be shown to work. Eventually, this
task was completed by ’t Hooft and Veltman, who showed that all
locally gauge-invariant theories are renormalizable [130].

The theory of the EM interaction (QED) is locally gauge-invariant
and we can use ’t Hooft’s theorem to assure ourselves that it is
renormalizable. The same is true for QCD.

For the weak interaction, however, we have a problem. The bosons
are massive and enter in a different (non-locally gauge-invariant)
way to the EM interaction in the Dirac equation. Now ’t Hooft’s
theorem does not apply. This is where the Higgs mechanism for
providing particle masses is crucial. The theory is outlined and
results on the Higgs from the LHC are given in Chapter 12.

Fig. 7.10 Antineutrino–electron scattering in the original Fermi 4-point
theory (a) and including the W intermediate boson (b).

Fig. 7.11 $\nu\bar{\nu} \to W^+W^-$: t-channel, electron exchange (a) and
neutral-current, Z exchange (b).

Fig. 7.12 The three diagrams contributing to $e^+e^- \to W^+W^-$, via
Z exchange (a), $\gamma$ exchange (b), and t-channel neutrino exchange (c).

Fig. 7.13 Higher-order Feynman diagram with an $e^+e^-$ loop for the
process $e^+e^- \to e^+e^-$.


Chapter summary 


• All leptonic charged-current weak interactions are described with one
coupling constant $g_W$.

• The weak interaction is determined from experiment to be V−A,
i.e. $\gamma_\mu(1 - \gamma_5)$.

• The universality of the weak interaction with coupling $g_W$ also holds for
quarks, provided the CKM-rotated states (d′, s′, b′) are used for the
charge $-\tfrac13$ quarks.

• Electroweak unification is outlined, with charged-current interactions
mediated by the $W^\pm$ and neutral-current interactions by the $Z^0$ and
the photon (which remains massless).


Further reading 


• Griffiths, D. (2008). Introduction to Elementary Particles
(2nd revised edn). Wiley-VCH. This gives a
very clear introduction to Feynman rules and how to
perform the calculations.

• Burcham, W. E. and Jobes, M. (1994). Nuclear and
Particle Physics. Pearson. This contains more advanced
discussion of the electroweak theory and a good
introduction to renormalization theory.

• Taylor, J. C. (1979). Gauge Theories of Weak Interactions.
Cambridge University Press. This gives a
concise account of the quantum field theory and gauge
symmetry underlying electroweak unification.

• Halzen, F. and Martin, A. D. (1984). Quarks and
Leptons: An Introductory Course in Modern Particle
Physics. Wiley. This is another very good graduate-level
textbook explaining the theory of the Standard
Model.

• Thompson, M. (2013). Modern Particle Physics. Cambridge
University Press. This is another recent textbook
covering the Standard Model well.


Exercises 


(7.1) Verify the identity $\gamma_\mu\gamma_5 + \gamma_5\gamma_\mu = 0$. Hint: See
Chapter 6 for the definition of the $\gamma$ matrices.

(7.2) Explain why in general the vector coupling constants
$c_V^{(f)}$ depend on the weak mixing angle $\sin^2\theta_W$
but the axial vector coupling constants $c_A^{(f)}$ do not.
Why does $c_V^{(\nu)}$ for the neutrinos not depend on
$\sin^2\theta_W$?

(7.3) Show that $\bar{u}\gamma_\mu u = \bar{u}_L\gamma_\mu u_L + \bar{u}_R\gamma_\mu u_R$.

Hint: Show that $\bar{u}_R\gamma_\mu u_L = 0$ by adapting
eqn 7.25 and then proceed backwards through the
steps used to derive the original eqn 7.25; you
should find a combination of projection operators
that gives zero.

(7.4) What are the possible decay modes of the $\tau$?

Given that the lifetime of the muon is $2 \times 10^{-6}$ s,
estimate the expected lifetime of the $\tau$. How might it
be measured?

Neglecting density-of-states factors, what is the
expected ratio of branching ratios for

$$
\frac{\tau^+ \to K^+\bar{\nu}_\tau}{\tau^+ \to \pi^+\bar{\nu}_\tau}
$$

Starting with an intense 800 GeV proton beam, how
could a high-energy neutrino beam, enriched in $\nu_\tau$,
be produced? Explain the origin of reducible and
irreducible backgrounds of other neutrino flavours.

(7.5) Draw quark flow diagrams for the following decays
and discuss the dependence of the decay rates on
the matrix elements of the CKM matrix:

(a) $\mu^+ \to e^+ + \bar{\nu}_\mu + \nu_e$
(b) $K^+ \to \mu^+ + \nu_\mu$
(c) $\pi^+ \to \mu^+ + \nu_\mu$
(d) $D^+ \to K^- + \pi^+ + \pi^+$
(e) $D^+ \to K^+ + \pi^+ + \pi^-$

(7.6) The $\Lambda$ has a mean lifetime of $2.6 \times 10^{-10}$ s and decays
into $p + e^- + \bar{\nu}_e$ with a branching fraction of
$8.3 \times 10^{-4}$. The $\Lambda_c^+$ (udc) has a mean lifetime of
$2.1 \times 10^{-13}$ s. Estimate the branching fraction of
$\Lambda_c^+ \to \Lambda + e^+ + \nu_e$ and comment on how your result
compares with the measured value.

[$m(\Lambda_c^+) = 2.285$ GeV, BR($\Lambda_c^+ \to \Lambda e^+ \nu_e$) = $(2.1 \pm 0.6)$%.]

(7.7) Explain how each of the following measurements
could be used to determine a single element of the
CKM matrix or a combination of elements:

(a) the ratio of $\mu$-pair production to single-$\mu$ production
in $\nu_\mu$ interactions on nuclei;

(b) the decay rate of $D_s^+ \to \tau^+\nu_\tau$;

(c) the decay rate of $B^0 \to D^{*-}\mu^+\nu_\mu$;

(d) the region of the $\mu$-momentum spectrum near
the kinematic endpoint from $B \to X\mu\nu_\mu$ decays
(where $X$ is any hadronic final state).

How are these results affected by the strong interaction?

(7.8) Draw quark flow diagrams for the decays of the
charm mesons $D^0$ and $D^+$. In the ‘spectator’ model,
the decay rate is only determined by the quark
that changes flavour. Explain why this model works
quite well for the equivalent beauty mesons but not
so well for the charm mesons.

(7.9) Draw two Feynman diagrams for each of the decays
$D^0 \to K^+\pi^-$ and $D^0 \to K^-\pi^+$. Estimate the
ratio of the two decay rates based on the different
CKM matrix elements and explain why the phase-space
factors should be the same. Compare your
prediction with the measured values in [115] and
comment on the origin of any discrepancies.

(7.10) The matrix elements in Z decay are proportional
to a factor $c_V - c_A\gamma_5$. Explain why the partial width
for a given decay mode is proportional to a factor of
$c_V^2 + c_A^2$. The measured partial widths of the Z in decays
to hadrons and to electrons are $\Gamma_{\mathrm{had}} = 1744 \pm 2.0$ MeV
and $\Gamma_{ee} = 83.92 \pm 0.12$ MeV. Determine
the compatibility of the ratio of these two measurements
with the SM and a value of $\sin^2\theta_W = 0.231$
and explain any discrepancies between your
prediction and the experimental value.


Experimental tests 
of electroweak theory 


Chapter 7 covered the basic ideas of weak-interaction theory and the
unification of the electromagnetic and weak interactions. The weak
charged-current interaction is expressed using a unique coupling constant $g_W$ for
all leptonic vertices: $(\nu_e, e)$, $(\nu_\mu, \mu)$, $(\nu_\tau, \tau)$. The coupling constant is
also valid for charged-current interactions involving quarks using the
weak quark eigenstates, (u, d′), (c, s′), (t, b′), which are a rotation of
the flavour eigenstates. We showed how the absence of flavour-changing
weak neutral currents is explained using the Glashow, Iliopoulos, and
Maiani (GIM) mechanism. We gave an outline of how the weak and
electromagnetic interactions are unified into a single consistent theory.
The resulting electroweak theory includes only one additional parameter,
$\sin\theta_W$, and predicts all the features of the weak neutral current.

In this chapter, we review some of the experimental evidence that 
underpins the electroweak theory. We start with some of the key neutrino 
experiments, then look at the discovery of neutral currents, as this was 
the first step towards a unified electroweak theory. The key prediction 
of the theory was the existence of the massive W and Z bosons, with 
quite precise estimates of their masses. Their discovery with masses in 
the predicted range was a triumph! Electroweak theory has now been 
probed to much higher precision by many experiments, particularly at 
LEP and the Tevatron. Many details have since been filled in, but the 
basic structure remains unchanged. 


8.1 Neutrinos 


When Pauli postulated the existence of neutrinos, he was afraid that the
cross sections were so small that they would never be measurable. The
experimental discovery of neutrinos was made possible by the intense
flux of antineutrinos from nuclear reactors.¹ The reaction studied was
$\bar{\nu}_e p \to e^+ n$, using water as the target. Each $e^+$ annihilated with an $e^-$
to produce two photons. The photons were detected using tanks of liquid
scintillator viewed by photomultipliers. To reduce the backgrounds,
cadmium chloride was added to the water. This allowed neutrons to be
captured by $n + {}^{108}\mathrm{Cd} \to {}^{109}\mathrm{Cd} + \gamma$. The neutron capture happened a few
microseconds after the first reaction. Therefore, a clean signal for the
reaction was a flash of light followed by a delayed coincidence.²


Particle Physics in the LHC Era, Giles Barr, Robin Devenish, Roman Walczak, 
& Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak, 
& Tony Weidberg 2016. Published in 2016 by Oxford University Press. 
¹Later, high-energy proton accelerators
were used to create neutrino beams at
much higher energies (see Chapter 11).


²The proof that the events were from
the reactor as opposed to backgrounds
like cosmic rays was provided by running
with the reactor off.




³The electron spins responsible for
ferromagnetism are aligned antiparallel
to the B field.


⁴The steel plates used came from an old
battleship.


Fig. 8.1 Schematic of the experiment
to measure the neutrino helicity
(labelled parts: $^{152}$Eu source, Fe, Pb shield).


If detecting neutrinos was challenging, how could one measure the helicity
of a neutrino? The answer came in a brilliant experiment [79]. The
key idea was to transfer the helicity of the neutrino to a photon, which
could be measured relatively easily. The source of the neutrinos was electron
capture, $\mathrm{Eu}\,e^- \to \mathrm{Sm}^*\,\nu_e$, with the subsequent decay $\mathrm{Sm}^* \to \mathrm{Sm}\,\gamma$.
As the energy is shared between the Sm and the $\gamma$, in general the $\gamma$ does
not have sufficient energy to be absorbed by another Sm nucleus. However,
if the Sm$^*$ decays with the $\gamma$ travelling in the direction of the Sm$^*$,
then it will have sufficient energy to make a resonant scatter off Sm (see
Exercise 8.1). The photon must have the same helicity as the neutrino
(see Exercise 8.1). The experimental apparatus is sketched in Fig. 8.1.
Photons of the correct energy will be resonantly scattered by the ring of
Sm$_2$O$_3$ and then detected by the NaI(Tl) scintillator coupled to a
photomultiplier. The rate is measured with two polarities of the magnetic field.
The scattering cross section for $\gamma$Sm is greater if the spin of the photon
is anti-aligned with that of the iron than if it is aligned. The polarization
of the photons could thus be determined and hence the helicity of
the neutrino also. It was found to be consistent with $-1$, as expected.

The concept of lepton number was invented to explain the absence of
decays $\mu \to e\gamma$. Neutrinos must also carry lepton number. The
experimental demonstration [69] that $\nu_e$ are distinct from $\nu_\mu$ came from an
experiment in which a $\nu_\mu$ beam was fired at a 5000 ton steel wall⁴ to
absorb all particles other than neutrinos. The neutrinos then interacted
in aluminium plates and the resulting charged particles were detected
by spark chambers. Muons could be separated from electrons because of
their longer range. The observation of muons coupled with the absence
of electrons showed that $\nu_\mu$ were distinct from $\nu_e$.

When the $\tau$ lepton was discovered, it was assumed that there would be
an associated neutrino, $\nu_\tau$. The experimental confirmation of the $\nu_\tau$ was
made by the DONUT Collaboration [92]. Producing a $\nu_\tau$ beam is quite
a challenge. The first step was to direct an intense beam of 800 GeV protons
at a tungsten target. The forward-going interaction products then
entered a magnetic field, which swept charged particles aside, greatly
enhancing the neutrino content of the beam, including $\nu_\tau$ (from the decays
of charm mesons, such as the $D_s$). As for other neutrinos, a $\nu_\tau$ can be
identified only if it first undergoes a charged-current interaction,
$\nu_\tau X \to \tau Y$, producing a charged $\tau$ that can then be detected. The short lifetime
of the $\tau$ leads to tracks with ‘kinks’ near the primary vertex. These kinks
were identified in an emulsion chamber, but in order to determine which
volume of the emulsion to measure, a magnetic spectrometer was used
to identify candidate $\tau$ events. Four such events were found, significantly
above the background of 0.34 events.


8.2 Charged currents 


The theory of charged currents developed in Chapter 7 is based on a
$V-A$ structure. This is parity-violating, and the first clear experimental
observation of parity violation was in the decay
${}^{60}\mathrm{Co}\,(J^P = 5^+) \to {}^{60}\mathrm{Ni}^* + e^- + \bar{\nu}_e$. The Co nuclei were aligned by an external
magnetic field.⁵ The rate of emission of electrons was found to be consistent
with an angular distribution of the form $1 - (v/c)\cos\theta$, where $\theta$ is the
angle between the electron 3-momentum and the magnetic field direction,
as predicted by a $V-A$ interaction. A more recent and more
direct demonstration of parity violation is given by the angular distribution
of leptons from W decays (see Section 8.5.1). Very strong
evidence for the $V-A$ theory also comes from the ratio of branching
ratios $\mathrm{BR}(\pi^+ \to e^+\nu_e)/\mathrm{BR}(\pi^+ \to \mu^+\nu_\mu)$; see Exercise 6.4. The cleanest
probe of the $V-A$ theory comes from muon decays $\mu^- \to e^-\bar{\nu}_e\nu_\mu$. Using
intense muon beams (from pion decays), very precise measurements of
the electron spectrum from stopped muons agree with the $V-A$ theory
and place stringent limits on contributions from other interactions.


8.2.1 Measurements of CKM matrix elements 


In Chapter 7, we outlined how the theory of charged-current weak inter- 
actions of leptons could be extended to quarks with a universal coupling 
strength. This required the CKM matrix to allow for the rotation be- 
tween the weak and mass eigenstates of the quarks. The theory gives 
no predictions for the values of these matrix elements, so they have to 
be determined experimentally. However, as multiple experiments can be 
performed to determine the same element of the CKM matrix, power- 
ful consistency checks of the theory can be performed. The CKM matrix 
should be unitary, so this gives additional constraints on the theory. This 
aspect will be developed in Chapter 10. We give a very brief outline here 
of some of the methods used to measure the CKM matrix elements.⁶

Measurement of CKM matrix elements proceeds by measuring many
different processes; for example $V_{ud}$ is measured by comparing hadronic
beta decays with the decay of the muon, and $V_{us}$ is measured
by comparing the decay $K \to \pi e\nu$ with a non-strange decay.

The ratio $V_{ub}/V_{cb}$ can be determined from the muon spectrum in the
decays $b \to c\mu\nu_\mu$ and $b \to u\mu\nu_\mu$. As the u quark is much lighter than
the c quark, the spectrum of muons from $b \to u$ decays extends beyond
the end of that from $b \to c$. This enables a clean sample of muons from
$b \to u$ to be identified despite the fact that the ratio $V_{ub}/V_{cb} \ll 1$.

The ratio $V_{cd}/V_{ud}$ can be measured by observing the rate of dimuon
to single-muon production in charged-current neutrino interactions on
hadrons. If the neutrinos are above threshold, they can produce either
charm or up quarks (see Fig. 8.2). A known fraction of the events with
a charm quark will result in the semimuonic decay of the charm quark
($c \to \mu X$) and hence result in events with two muons, whereas the events
in which an up quark was produced will result in events with single
muons. As the Feynman diagrams are the same for the two processes,
the difference in the rates is simply given by the ratio of the CKM
elements $V_{cd}/V_{ud}$ once $\mathrm{BR}(c \to \mu X)$ has been accounted for.⁷
⁵This required adiabatic demagnetization
to achieve sufficiently low temperatures,
0.01 K.


⁶See the review from the Particle Data
Group in Further Reading for a comprehensive
discussion.


Fig. 8.2 Feynman diagram for
$\nu d \to \mu c(u)$, with W exchange.


⁷We have assumed that the energy is
far above threshold.
Fig. 8.3 (a) $\nu e^- \to \nu e^-$, for any
neutrino flavour, via a neutral current,
with Z exchange. (b) $\bar{\nu}_e e^- \to \bar{\nu}_e e^-$, for
an electron neutrino, via a charged
current, with W exchange.


8.3 Neutral currents 


The neutral current was first discovered in the Gargamelle bubble chamber
at CERN in the reaction $\bar{\nu}_\mu e^- \to \bar{\nu}_\mu e^-$, elastic scattering of $\bar{\nu}_\mu$ off
atomic electrons. The scattering reaction of either $\nu_\mu$ or $\bar{\nu}_\mu$ off electrons
is an unambiguous signal of a neutral current. Scattering of either $\nu_e$
or $\bar{\nu}_e$ is ambiguous since there is a charged-current diagram for each of
these reactions (see Fig. 8.3).

About 100 events were observed in total and, by measuring the cross 
section, a value of $\sin^2\theta_W = 0.24 \pm 0.04$ was obtained. This was the first
success for the unified electroweak theory and it enabled the prediction 
of the masses of the W and Z bosons (see Section 8.5.1). 

Subsequent studies of weak neutral currents with neutrino beams
used electronic detectors that allowed the accumulation of much larger
numbers of events. For example, the CHARM2 experiment used the
neutral-current reactions $\nu e^- \to \nu e^-$ and $\bar{\nu} e^- \to \bar{\nu} e^-$. The ratio of
the cross sections at the same energy is given by (see Exercise 8.2)


$$R = 3\,\frac{1 - 4\sin^2\theta_W + \frac{16}{3}\sin^4\theta_W}{1 - 4\sin^2\theta_W + 16\sin^4\theta_W}. \qquad (8.1)$$
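
The dependence of eqn (8.1) on the mixing angle can be explored numerically. The short sketch below evaluates $R$ and inverts it by bisection; the function names and the sample value 0.231 are illustrative choices, not part of the CHARM2 analysis.

```python
def ratio_r(sin2_theta_w: float) -> float:
    """Eqn (8.1): ratio of the nu-e to nubar-e cross sections at equal energy."""
    x = sin2_theta_w
    numerator = 1.0 - 4.0 * x + (16.0 / 3.0) * x**2
    denominator = 1.0 - 4.0 * x + 16.0 * x**2
    return 3.0 * numerator / denominator


def solve_sin2_theta_w(r_measured: float, tol: float = 1e-12) -> float:
    """Invert R by bisection; R decreases monotonically for 0 < x < 0.5."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ratio_r(mid) > r_measured:
            lo = mid  # R too large, so the mixing angle must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)


r_example = ratio_r(0.231)
recovered = solve_sin2_theta_w(r_example)
print(f"R(0.231) = {r_example:.4f}, recovered sin^2(theta_W) = {recovered:.4f}")
```

Because many systematic uncertainties cancel in a ratio, a measurement of $R$ translates fairly directly into a value of $\sin^2\theta_W$.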


An intense beam of neutrinos was used and particular care was taken to
reduce backgrounds. For example, consider a neutral-current interaction
on a nucleus, $\nu_\mu N \to \nu_\mu \pi^0 X$. The photons from the $\pi^0$ decay may
pair-convert ($\gamma \to e^+e^-$) in the detector material, potentially faking the
single-electron signature. However, the electron energy ($E_e$) and angle
($\theta_e$) with respect to the neutrino beam are limited by (see Exercise 8.3)


$$E_e\theta_e^2 < 2m_e. \qquad (8.2)$$


Hence the quantity $E_e\theta_e^2$ will be peaked at small values on top of a
continuous background. This requires a neutrino detector with very good
angular and energy resolution. The target was made from glass since this
contains elements of relatively low atomic number and hence minimizes
multiple scattering, which limits the angular resolution (see Chapter 4).
The final result from the CHARM2 [133] experiment was $\sin^2\theta_W =
0.2324 \pm 0.0083$.
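
To see why eqn (8.2) demands good angular resolution, the bound can be turned into a maximum scattering angle for a given electron energy. This is a minimal sketch; the electron energies in the loop are arbitrary illustrative values, not CHARM2 data.

```python
import math

M_E_GEV = 0.000511  # electron mass in GeV


def max_scattering_angle_rad(electron_energy_gev: float) -> float:
    """Largest angle allowed by E_e * theta_e^2 < 2 m_e (small-angle form)."""
    return math.sqrt(2.0 * M_E_GEV / electron_energy_gev)


for energy in (1.0, 10.0, 30.0):  # illustrative energies in GeV
    theta_mrad = 1e3 * max_scattering_angle_rad(energy)
    print(f"E_e = {energy:5.1f} GeV  ->  theta_e < {theta_mrad:.1f} mrad")
```

The allowed angles are of order tens of milliradians or less, so the detector resolution must be comparable for the signal peak to stand out.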


8.4 Physics at $e^+e^-$ colliders


$e^+e^-$ machines are the place of choice to study the $Z^0$. The reasons are
as follows:

• $e^+$ and $e^-$ are point-like and the centre-of-mass (CMS) energy is
fixed.

• Unlike a hadron collider, there is no underlying event (see below)
from the beam-particle spectator quarks not involved in the hard
collision.

• Positrons are much easier to produce than antiprotons.


Because of the finite size of hadrons, an ‘underlying event’ in a hadron- 
hadron collider is a superposition of multiple quark or gluon collisions 
occurring in the same hadron-hadron collision as the ‘hard scatter’ 
of interest between hadron constituents. Usually, the underlying event 
consists of particles produced at small angles to the beam axis. 

The big disadvantage is that electrons and positrons in circular colliders
lose energy through synchrotron radiation.⁸ All charged particles,
when accelerated, will radiate energy at a rate $\propto 1/m^4$. So, for electrons
and protons of the same energy in circular colliders of the same radius,
the ratio of energy loss is $(m_p/m_e)^4 \approx 10^{13}$. The alternative of using a
linear $e^+e^-$ collider is discussed in Chapter 13.
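
The size of this effect can be checked with a few lines of arithmetic. The sketch below evaluates the mass ratio and, as an illustration, the energy lost per turn by an electron using the standard relation $U_0 = C_\gamma E^4/\rho$; the bending radius of about 3.1 km is an assumed round figure for LEP and is not quoted in the text.

```python
M_E_GEV = 0.51099895e-3   # electron mass in GeV
M_P_GEV = 0.93827209      # proton mass in GeV
C_GAMMA = 8.846e-5        # m / GeV^3, electron synchrotron radiation constant

# Ratio of radiated energy for electrons and protons of equal energy and radius.
loss_ratio = (M_P_GEV / M_E_GEV) ** 4


def energy_loss_per_turn_gev(beam_energy_gev: float, bending_radius_m: float) -> float:
    """Energy radiated per turn by one electron, U0 = C_gamma * E^4 / rho."""
    return C_GAMMA * beam_energy_gev**4 / bending_radius_m


ASSUMED_LEP_BENDING_RADIUS_M = 3100.0  # assumption for illustration only
loss = energy_loss_per_turn_gev(45.6, ASSUMED_LEP_BENDING_RADIUS_M)
print(f"(m_p/m_e)^4 = {loss_ratio:.2e}")
print(f"loss per turn at 45.6 GeV = {1e3 * loss:.0f} MeV")
```

The steep $E^4$ dependence is the reason why the losses grow so quickly when the beam energy of a circular machine is raised.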

A chain of accelerators⁹ (see Table 8.1) was used to produce the $e^+e^-$
beams and increase their energies up to the target 45 GeV per beam
in the LEP ring (Fig. 8.4). The chain started with linear accelerators
(LINAC) and then progressed through three circular synchrotron machines
(PS, SPS, and LEP). Filling took about one hour, with beams
accumulating in LEP at 20 GeV (two cycles every 14.4 s). The
counter-circulating LEP beams were then accelerated from 20 GeV to 45 GeV
and left ‘coasting’ (i.e. there was no further acceleration; the radiofrequency
cavities were used just to replace energy lost through synchrotron
radiation) for about 8 hours, during which time the $e^+e^-$ beams collided
at the positions of the four detectors at LEP: ALEPH, DELPHI, L3, and
OPAL.


8.4.1 Detailed look at the detectors 


The detectors were of a roughly cylindrical shape providing almost ‘$4\pi$’
(steradian) coverage; i.e. they were sensitive to particles going in any 
direction from the interaction point (with no cracks or holes except for 
the beam pipe through the centre). This was very important for some 
of the analysis techniques we are going to discuss later. The OPAL de- 
tector is shown in Fig. 8.5 as a typical example. The LEP detectors 
were general-purpose collider detectors as described in Chapter 4. There 
⁸More details are given in Section 3.2.2.


⁹See Chapter 3 for an explanation of
the need for a chain of accelerators between
the source and the high-energy ring.

Table 8.1 Chain of accelerators for LEP.

Accelerator   Beam and energy reached
LINAC         $e^-$ to 200 MeV
              ($e^+$ produced in converter, EM shower)
LINAC         $e^+$, $e^-$ to 600 MeV
PS            $e^\pm$ to 3.5 GeV
SPS           $e^\pm$ to 20 GeV
LEP           $e^\pm$ to 45 GeV


Fig. 8.4 Map of the accelerator complex
at CERN and the four LEP detectors.
From [17].
Fig. 8.5 The OPAL detector. From [17].


¹⁰This is another demonstration that
there is no perfect detector design.


were interesting differences between the detectors because of the different
optimization strategies employed. For example, the L3 detector had a very
high-resolution electromagnetic calorimeter based on BGO (bismuth germanium
oxide). BGO is a dense crystal with a good scintillation yield
and was used as a homogeneous calorimeter. However, the smaller size
and lower value of the magnetic field meant that the charged-track resolution
was not as good as for the other LEP experiments.¹⁰ Crucial parts
of all four of the LEP detectors were the vertex detectors, which used
silicon as the active material. These provided high-resolution tracking
close to the beam pipe, which enabled the identification of jets from b
quarks using the relatively long lifetime of B mesons (see Chapter 4).
We will now summarize the different types of events seen at LEP when
running at the $Z^0$ resonance as a review of what particles do as they
pass through material. What the detectors ‘see’ is visualized using event
displays, examples of which are shown in Fig. 8.6. Event displays are
Fig. 8.6 Four event types at LEP from the DELPHI experiment: (a) $e^+e^-$; (b) $\mu^+\mu^-$; (c) $\tau^+\tau^-$; (d) quark–antiquark pair.
Fig. 8.7 Main cuts used to separate
the different classes of $e^+e^-$ events.
From [43].


important tools for checking that the detectors are functioning correctly, 
for pedagogic purposes, and for examining unusual events. 


• $Z \to e^+e^-$: Two back-to-back showers in the electromagnetic
calorimeter with tracks leading towards them. Nothing in the hadron
calorimeter or muon detectors.


• $Z \to \mu^+\mu^-$: Two back-to-back tracks leading all the way out to
the muon detectors.


• $Z \to \tau^+\tau^-$: A variety of topologies that have low multiplicities
and missing momentum (from the neutrino(s)).


• $Z \to \nu\bar{\nu}$: Nothing (see Exercise 8.4).


• $Z \to$ 2-jets, $Z \to q\bar{q}$: The quarks form ‘jets’, groups of mesons
and baryons collimated along the initial quark direction. Generally,
it is not possible to tell what flavour the original quark had.


• b-jets: For b (and c) quarks, using the silicon vertex detectors, it
is possible to detect whether the particles all extrapolate back to
the primary vertex or whether there is a secondary vertex where
the b quark decayed.


• $Z \to$ 3-jets, $Z \to q\bar{q}g$: With a gluon ‘bremsstrahlung’ emitted
from one of the quarks, also materializing as a jet.


The event categories can be distinguished, in particular $e^+e^-$ and $\mu^+\mu^-$
by their distinctive two-particle topologies. A plot of the total invariant
mass of all the particles in the event versus the number of charged particles
is shown in Fig. 8.7. The $\tau^+\tau^-$ events can be distinguished from
2-jet events with these variables.
8.4.2 Aspects of a physics analysis


What is written here is valid for the analysis of any high-energy physics
data, but is described in the context of a LEP experiment. An analysis
generally goes along the following lines: the experimenters first choose
selection criteria (cuts) that select the desired events (the signal), such
as requiring n tracks that look like muons, or tracks above a certain energy,
etc. This may involve reconstructing the mass of a combination
of particles from the measured 4-momenta of the tracks and showers.¹¹
Quite often there is background in the sample selected, where a different
physical process produces events that pass the same cuts. Some
backgrounds are ‘irreducible’ in that they produce the same final-state
particles as the signal; other backgrounds are ‘reducible’ in that they
could be reduced by tighter cuts or with a better detector.
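
The mass reconstruction mentioned above amounts to summing the 4-momenta of the selected particles and taking the invariant length of the sum. A minimal sketch follows, with natural units ($c = 1$) and a made-up pair of back-to-back, effectively massless muons as input.

```python
import math


def invariant_mass(four_momenta):
    """Invariant mass of a set of particles given as (E, px, py, pz) in GeV."""
    energy = sum(p[0] for p in four_momenta)
    px = sum(p[1] for p in four_momenta)
    py = sum(p[2] for p in four_momenta)
    pz = sum(p[3] for p in four_momenta)
    mass_squared = energy**2 - px**2 - py**2 - pz**2
    # Measurement errors can push m^2 slightly negative; clamp at zero.
    return math.sqrt(max(mass_squared, 0.0))


# Hypothetical Z -> mu+ mu- candidate: two 45.6 GeV muons, back to back along x.
muon_plus = (45.6, 45.6, 0.0, 0.0)
muon_minus = (45.6, -45.6, 0.0, 0.0)
pair_mass = invariant_mass([muon_plus, muon_minus])
print(f"reconstructed mass = {pair_mass:.1f} GeV")
```

A cut on such a reconstructed mass is a typical way to separate a resonant signal from a smoothly varying background.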

Calibration concerns subtracting pedestals¹² and measuring the
gains and linearities of each channel, particularly for calorimeters. If the
calibration is not done correctly, then the energy resolution of the calorimeter
suffers. The acceptance for the process we are studying has to
be calculated. Acceptance is the probability of the particles in an event
hitting a certain part of the detector and having a certain minimum
energy or momentum.¹³ Accidentals or pile-up concerns the problem
when two events occur at the same time (within detectable resolution)
and the resulting combination gets into the data sample. Accidentals can
also be a problem if a preceding event causes electronics to momentarily
become inactive before being ready to measure a new pulse, or if the preceding
event causes a movement in the pedestals that causes the energy
to be measured slightly wrongly. Accidentals were not a big problem in
LEP experiments, because the rate of events was low.


8.4.3 Monte Carlo simulation 


An essential tool for any analysis of data from a large particle physics 
detector is the Monte Carlo simulation computer program.¹⁴ The idea is
to produce simulated events, using a random number generator at each 
point where a choice in what happens must be made. Both the physics 
process and the detector response are simulated. 

Figure 8.8 shows a block diagram of the steps involved from the choice 
of physics channel to be simulated through the generation of the detector 
response, then the digitization of the Monte Carlo ‘data’ in an identical 
format to that produced by the various detector components that make 
up the complete detector. The Monte Carlo data are then run through 
the complete ‘real’ data reconstruction chain. They are then available 
for analysis by the same analysis codes that are used for the real data— 
the only difference is that one knows what physics process was used to 
generate the events. 

First is the ‘physics simulator’: physics events are simulated starting 
from theoretical matrix elements for the basic quark and gluon scat- 
tering processes that could occur in a hard scatter. The output will be 




¹¹Sometimes we assume that the combination
has a mass m = 0.


¹²Even if there is no genuine signal in a
detector channel, the readout will still
deliver a non-zero value. This value is
called the ‘pedestal’, from the shape
of the distribution. The pedestal value
must be measured and subtracted from
genuine signals.


¹³Usually for a $4\pi$ LEP detector, it was
almost 100%; however, this could have
been reduced if cuts were made around
any dead channels.


¹⁴Invented by Ulam and Metropolis at
Los Alamos in the 1940s and named
after the city in Monaco where there
is a big casino.


Fig. 8.8 Monte Carlo flow diagram. Boxes, in order: Generate primary
event; Track particles in detector, record hits; Digitization;
Simulated raw data.
¹⁵Detector simulation is very computer
intensive and this is another aspect
of modern high-energy physics where
GRID computing is essential.


the type and 4-momentum of quarks, gluons, and leptons produced. In 
some cases, for example $e^+e^- \to \tau^+\tau^-$ and electromagnetic processes,
the matrix elements can be calculated. The procedure is similar for hard 
quark and gluon scattering. The underlying event is produced from so- 
called phenomenological models that describe the observed properties 
of the small-angle scattering involved. The physics events can also be 
chosen to be all of one type if one is trying to work out the best way 
to select events from a particular physics source (e.g. top-quark produc- 
tion). Sometimes, it is useful to generate only a single type of particle, 
for example if one needs to know what type of signals it will produce in 
the different parts of the detector. 

Next is the ‘detector simulator’: this is usually the most difficult part 
to produce and requires a thorough understanding of how the detector 
components work.¹⁵ Random number simulation is used to decide how
each particle proceeds as it passes through various types of material in 
the detector. For some particles, for example electromagnetic particles 
($e^\pm$, $\gamma$), the response of most materials is well documented, and well-
honed computer codes exist that can be modified to whatever geometry
is required. For hadronic particles, things are more complicated, first 
because most of these particles are unstable and will decay either within 
the beam pipe or within the detector layers nearest to the beam. The 
decays have to be simulated and the resulting products (mostly pions, 
together with some kaons and nucleons) followed through the detector 
layers until they are absorbed. The exceptions are neutrinos (which will 
leave no signal in a typical collider detector) and high-energy muons, 
which will penetrate through the main detector and surrounding magnet 
and so require special and very large-area but fairly simple charged- 
particle detectors covering the outside of the main detector. The final 
step is to collect the simulated signals, format them as though they 
were real data, and add the ‘book keeping’ records. These steps are 
summarized in the remaining three boxes in the flow diagram in Fig. 8.8. 


Example 


As an example, in generating a simulated $e^+e^- \to Z^0 \to \tau^+\tau^-$ event, the
choices could be as follows:


(1) Pick the direction along which the $\tau^+$ travels and decays (using the
appropriate angular distribution), then use momentum
conservation to work out the direction of the $\tau^-$.

(2) Choose the decay time for each $\tau$ by picking from an exponential
probability distribution $e^{-t/\tau_\tau}$, where $\tau_\tau$ is the mean decay time of
the $\tau$ lepton, and work out where it decays.

(3) Choose from a table of measured branching ratios what daughter
particles each $\tau$ will decay into.

(4) Follow the decay products until they decay, are absorbed in the
detector material, or exit (e.g. neutrinos).
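
Steps (1) to (3) can be sketched in a few lines of code. This is a toy illustration under stated assumptions, not the generator used by any LEP experiment: the angular distribution is taken as $1 + \cos^2\theta$ (ignoring the small asymmetry), the $\tau$ mass and lifetime are approximate, the branching fractions are rounded into three illustrative classes, and step (4), tracking through the detector, is omitted.

```python
import math
import random

# Assumed inputs (approximate; none of these numbers appear in the text).
BEAM_ENERGY_GEV = 45.6
TAU_MASS_GEV = 1.777
TAU_MEAN_LIFE_S = 2.9e-13
SPEED_OF_LIGHT_M_S = 2.998e8
# Rounded, illustrative branching fractions that sum to 1.
DECAY_TABLE = [("e nu nu", 0.18), ("mu nu nu", 0.17), ("hadrons nu", 0.65)]


def sample_direction(rng):
    """Step 1: unit vector with cos(theta) drawn from 1 + cos^2(theta)."""
    while True:  # accept-reject under a flat envelope of height 2
        cos_theta = rng.uniform(-1.0, 1.0)
        if rng.uniform(0.0, 2.0) < 1.0 + cos_theta**2:
            break
    phi = rng.uniform(0.0, 2.0 * math.pi)
    sin_theta = math.sqrt(1.0 - cos_theta**2)
    return (sin_theta * math.cos(phi), sin_theta * math.sin(phi), cos_theta)


def sample_decay_length_m(rng):
    """Step 2: exponential proper time, boosted to a lab-frame flight length."""
    gamma = BEAM_ENERGY_GEV / TAU_MASS_GEV
    beta_gamma = math.sqrt(gamma**2 - 1.0)
    proper_time = rng.expovariate(1.0 / TAU_MEAN_LIFE_S)
    return beta_gamma * SPEED_OF_LIGHT_M_S * proper_time


def sample_decay_mode(rng):
    """Step 3: pick a decay channel from the cumulative branching fractions."""
    r = rng.random()
    cumulative = 0.0
    for mode, fraction in DECAY_TABLE:
        cumulative += fraction
        if r < cumulative:
            return mode
    return DECAY_TABLE[-1][0]


def generate_tau_pair(rng):
    """Return [tau+, tau-] as dicts; the tau- flies opposite to the tau+."""
    direction = sample_direction(rng)
    event = []
    for sign in (+1, -1):
        event.append({
            "charge": sign,
            "direction": tuple(sign * d for d in direction),
            "decay_length_m": sample_decay_length_m(rng),
            "decay_mode": sample_decay_mode(rng),
        })
    return event


rng = random.Random(2016)  # fixed seed so the toy sample is reproducible
events = [generate_tau_pair(rng) for _ in range(5000)]
lengths = [tau["decay_length_m"] for event in events for tau in event]
mean_length_mm = 1e3 * sum(lengths) / len(lengths)
print(f"mean tau flight length = {mean_length_mm:.2f} mm")
```

With these inputs the mean flight length comes out at a few millimetres, which is why silicon vertex detectors close to the beam pipe are needed to resolve the decay vertices.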


The technique works well and, by generating lots of simulated events, 
we are effectively performing a numerical integral over all the possible 
outcomes of what might happen by randomly sampling the integrand. 
The Monte Carlo technique is often used to estimate trigger efficiencies, 
acceptances, and accidental effects in an analysis. 

However, there are always systematic uncertainties associated with 
Monte Carlo calculations, particularly at hadron colliders. Therefore, 
wherever possible, the calculations should be done in a ‘data-driven’ 
way that does not rely so heavily on Monte Carlo calculations. This 
approach will be described in Chapter 13. 


Multivariate analysis methods 


The above description of an analysis is based on the simplest ‘cut-and- 
count’ approach in which one makes a series of selections on a sequence 
of variables and counts the number of events that pass all the selections. 
This approach is clearly not optimal if there are correlations between 
the variables that are different for the signal and background processes. 
Consider a toy example in which one is using two variables $x_1$ and $x_2$
to discriminate between signal and background. A cut-and-count analysis
would select a rectangular region in $(x_1, x_2)$ space (see Fig. 8.9).
However, a more powerful discrimination between the signal and the 
background might be obtained by selecting a ‘triangular’ region (see 
Fig. 8.9). In a typical analysis, we have to deal with many variables, 
so the optimization of the selection is non-trivial. Powerful statistical 
techniques like neural networks and boosted decision trees are used (see 
Behnke et al. in Further Reading). 
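
The gain from a non-rectangular selection can be illustrated with a toy Monte Carlo. In the sketch below the signal and background are assumed to be unit-width two-dimensional Gaussians centred at (1.5, 1.5) and (0, 0); these shapes and the cut values are invented for illustration. A rectangular cut is compared with a diagonal cut on $x_1 + x_2$ tuned to keep the same fraction of signal.

```python
import random


def gaussian_points(rng, n, centre):
    """n points with independent unit-width Gaussian x1 and x2 around centre."""
    return [(rng.gauss(centre, 1.0), rng.gauss(centre, 1.0)) for _ in range(n)]


def efficiency(points, selector):
    """Fraction of points passing the selection."""
    return sum(1 for p in points if selector(p)) / len(points)


rng = random.Random(42)
signal = gaussian_points(rng, 20000, 1.5)
background = gaussian_points(rng, 20000, 0.0)


# Cut-and-count: a rectangular region in the (x1, x2) plane.
def rectangular(p):
    return p[0] > 0.75 and p[1] > 0.75


sig_eff_rect = efficiency(signal, rectangular)
bkg_eff_rect = efficiency(background, rectangular)

# Diagonal cut x1 + x2 >= threshold, tuned to the same signal efficiency.
sums = sorted((x1 + x2 for x1, x2 in signal), reverse=True)
n_keep = round(sig_eff_rect * len(signal))
threshold = sums[n_keep - 1]


def diagonal(p):
    return p[0] + p[1] >= threshold


sig_eff_diag = efficiency(signal, diagonal)
bkg_eff_diag = efficiency(background, diagonal)

print(f"rectangular: signal {sig_eff_rect:.3f}, background {bkg_eff_rect:.4f}")
print(f"diagonal:    signal {sig_eff_diag:.3f}, background {bkg_eff_diag:.4f}")
```

For the same signal efficiency the diagonal boundary accepts noticeably less background in this toy. Neural networks and boosted decision trees generalize the idea to many variables and to curved boundaries.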


8.4.4 Physics at LEP 


LEP operation was in two phases: for LEP1, the CMS energy was close
to the mass of the $Z^0$. Similar physics was studied at the SLAC Linear
Collider (SLC), where the luminosity was much lower than at LEP,
although SLC had the advantage of being able to produce longitudinally
polarized electrons. In LEP2, the energy was increased to above the
threshold for $W^+W^-$ pair production.

We will review selected LEP1 physics in this section and discuss
LEP2 physics in Section 8.4.11. LEP produced a huge number of $Z^0$s,
$\sim 4.5 \times 10^6$ per detector. $Z^0$s decay into almost every type of particle we
know, so many things can be studied. Examples include the following:


$b\bar{b}$:          Lifetime of b quark
               $B^0$–$\bar{B}^0$ mixing
               $B^0$ CP violation (this is done better at BaBar, Belle,
               CDF, and LHC)

$\tau^+\tau^-$:      Branching ratios (pre LEP, there was a crisis
               because $\sum \mathrm{BR} > 100\%$!)
               Decay parameters (information on spins etc. gives
               information on W and Z currents)

Jets:          QCD tests and precision $\alpha_s$ measurements

Higgs, SUSY:   Searches for production and decay of these particles
               and other similar exotica

Others:        Measurements of individual particle branching ratios


Fig. 8.9 Event selection in the
$(x_1, x_2)$ plane.


There were also a number of individual measurements of high importance
that we discuss here in more detail: the mass of the $Z^0$ boson, the
number of neutrinos, and production cross section and decay parameters
of the $Z^0$, which are used to obtain the couplings $c_V$ and $c_A$ to
compare with the predictions of the electroweak theory.

The principle of all the measurements at LEP was to take runs at a
variety of different beam energies around the $Z^0$ peak. The experiments
recorded everything whenever there was a trigger (= an event), and
these were reconstructed later (offline). They were then classified as $ee$,
$\mu\mu$, $\tau\tau$, $q\bar{q}$, or luminosity Bhabha events (see Exercise 3.7). It was then
possible to measure the cross section as a function of CMS energy $\sqrt{s}$
and partial cross sections of various types (e.g. according to the decay
mode, or which direction the particles went).


8.4.5 The Z line shape 


The cross section as a function of $\sqrt{s}$ (at LEP, twice the beam energy)
displays a clean peak at the Z resonance. The cross section as a function
of $s$ for $e^+e^- \to Z^0 \to f\bar{f}$ (where $f$ is one of $e$, $\mu$, $\tau$, or a quark) is
given by


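$$\sigma_f(s) = \sigma_f^0\,\frac{s\,\Gamma_Z^2}{(s - M_Z^2)^2 + M_Z^2\Gamma_Z^2} \otimes (\text{QED corr.}) \otimes (\text{QCD corr.}), \qquad (8.3)$$

where

$$\sigma_f^0 = \frac{12\pi}{M_Z^2}\,\frac{\Gamma_e\Gamma_f}{\Gamma_Z^2} \qquad (8.4)$$
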


and the convolutions take account of the higher-order QED and QCD
corrections. Using the electroweak theory from Chapter 7 to compute
$\Gamma_f$ gives

$$\Gamma_f = \frac{G_F\sqrt{2}\,M_Z^3}{12\pi}\left[\left(c_V^{(f)}\right)^2 + \left(c_A^{(f)}\right)^2\right] N_{\rm colours} \otimes (\text{QCD corr.}). \qquad (8.5)$$

Equations (8.3) and (8.4) when put together give the usual relativistic
Breit–Wigner formula.

Recall that $c_V$ and $c_A$ are defined for the left- and right-handed parts
of a particle separately and vary depending on what type of fermion $f$
we have (see Table 7.6 and the discussion on page 199):
$c_V^{(f)} = I_3(f) - 2Q(f)\sin^2\theta_W$ and $c_A^{(f)} = I_3(f)$.
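
As a numerical illustration of eqns (8.3) to (8.5), the sketch below evaluates the lowest-order partial widths and the uncorrected Breit–Wigner cross section. The input values of $G_F$, $M_Z$, and $\sin^2\theta_W$ are approximate, the QED and QCD corrections are ignored, and the function names are illustrative, so the numbers are indicative only.

```python
import math

G_F = 1.1664e-5          # Fermi constant in GeV^-2 (approximate)
M_Z = 91.19              # Z mass in GeV (approximate)
SIN2_THETA_W = 0.231
GEV2_TO_NB = 389379.0    # (hbar c)^2 in nb GeV^2

# Fermion type: (I3, Q, number of colours)
FERMIONS = {
    "nu": (0.5, 0.0, 1),
    "e": (-0.5, -1.0, 1),
    "u": (0.5, 2.0 / 3.0, 3),
    "d": (-0.5, -1.0 / 3.0, 3),
}


def partial_width(fermion: str) -> float:
    """Eqn (8.5) without the QCD correction; result in GeV."""
    i3, charge, n_colours = FERMIONS[fermion]
    c_v = i3 - 2.0 * charge * SIN2_THETA_W
    c_a = i3
    prefactor = G_F * math.sqrt(2.0) * M_Z**3 / (12.0 * math.pi)
    return prefactor * (c_v**2 + c_a**2) * n_colours


# Two up-type (u, c) and three down-type (d, s, b) quarks are kinematically open.
gamma_had = 2.0 * partial_width("u") + 3.0 * partial_width("d")
gamma_z = gamma_had + 3.0 * partial_width("e") + 3.0 * partial_width("nu")


def peak_cross_section_nb(gamma_f: float) -> float:
    """Eqn (8.4) converted to nanobarns."""
    return 12.0 * math.pi / M_Z**2 * partial_width("e") * gamma_f / gamma_z**2 * GEV2_TO_NB


def breit_wigner_nb(sqrt_s: float, gamma_f: float) -> float:
    """Eqn (8.3) without the QED and QCD convolutions."""
    s = sqrt_s**2
    shape = s * gamma_z**2 / ((s - M_Z**2) ** 2 + M_Z**2 * gamma_z**2)
    return peak_cross_section_nb(gamma_f) * shape


for name in ("nu", "e", "u", "d"):
    print(f"Gamma({name}) = {1e3 * partial_width(name):.1f} MeV")
print(f"Gamma_Z (tree level) = {gamma_z:.3f} GeV")
print(f"hadronic peak cross section = {peak_cross_section_nb(gamma_had):.1f} nb")
```

The neutrino width per species comes out close to twice the charged-lepton width, which is what makes the total width sensitive to the number of light neutrino species.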
8.4.6 Z mass


The mass of the Z is obtained in principle simply by taking the curve 
of the cross section as a function of CMS energy (as shown in Fig. 8.10) 
and fitting the expression given in eqns (8.3)–(8.5) to find the best value
for $M_Z$. For reasons we will come to, it is important to measure $M_Z$
very accurately. In practice, there are various complications to take care 
of, including the QED and QCD corrections indicated in the formulae. 
Another important point is to know exactly what the beam energy is, 
which we describe later. 


8.4.7 The Z width; number of neutrinos 


The $Z^0$ can decay into a pair of neutrinos: $Z^0 \to \nu\bar{\nu}$. How many generations
of leptons are there? We know of three: $e$, $\mu$, and $\tau$. Provided the
mass of the associated neutrino is less than half the $Z^0$ mass, there is a
way to detect if there are any more. The total width of the $Z^0$, $\Gamma_Z$, is
made up of the sum of the partial widths of all its decay modes:


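$$\Gamma_Z = \Gamma_{\rm had} + \Gamma_{ee} + \Gamma_{\mu\mu} + \Gamma_{\tau\tau} + N_\nu\Gamma_{\nu\nu} \qquad (8.6)$$
$$\phantom{\Gamma_Z} = \Gamma_{\rm had} + 3\Gamma_{\ell\ell} + N_\nu\Gamma_{\nu\nu}, \qquad (8.7)$$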


where lepton universality has been assumed for the second step. The fit 
to the experimental data is sensitive to both the width of the peak and its 
height. The partial widths can be predicted from the electroweak model 
(eqn 8.5) and the branching ratios can be used as a check. Therefore, by 




Fig. 8.10 The $e^+e^- \to Z^0$ production
cross section as a function of CMS
energy. From [17].
¹⁶Again by fitting the shape of the
cross section as a function of beam
energy using eqns (8.3)–(8.5) with the partial
widths held fixed at their predicted
values from the Standard Model.


Fig. 8.11 Bending field in the LEP
accelerator (circumference $L$).


¹⁷This was discovered on a day when
there was a rail strike!


measuring the total width $\Gamma_Z$,¹⁶ the number of neutrinos, $N_\nu$, can be
measured. However, since

$$\sigma_f^0 \propto \frac{1}{\Gamma_Z^2} \qquad (8.8)$$

(eqn 8.4), the greatest sensitivity is obtained by simply measuring the
cross section at the very top of the peak. The result is

$$N_\nu = 2.9841 \pm 0.0083. \qquad (8.9)$$
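
The sensitivity of the peak cross section to $N_\nu$ can be made concrete with eqns (8.4) and (8.7). The widths used below are rounded, illustrative values close to the measured or Standard Model ones; they are assumptions for this sketch, not numbers taken from the text, and radiative corrections to the line shape are ignored.

```python
import math

M_Z = 91.19            # GeV (approximate)
GAMMA_HAD = 1.744      # GeV, assumed hadronic width
GAMMA_LL = 0.0840      # GeV, assumed width per charged-lepton species
GAMMA_NUNU = 0.1672    # GeV, assumed width per neutrino species
GEV2_TO_NB = 389379.0  # (hbar c)^2 in nb GeV^2


def total_width(n_nu: float) -> float:
    """Eqn (8.7): total Z width for n_nu light neutrino species."""
    return GAMMA_HAD + 3.0 * GAMMA_LL + n_nu * GAMMA_NUNU


def peak_hadronic_cross_section_nb(n_nu: float) -> float:
    """Eqn (8.4) for the hadronic final state, in nanobarns."""
    return (12.0 * math.pi / M_Z**2
            * GAMMA_LL * GAMMA_HAD / total_width(n_nu) ** 2 * GEV2_TO_NB)


for n_nu in (2, 3, 4):
    print(f"N_nu = {n_nu}: Gamma_Z = {total_width(n_nu):.3f} GeV, "
          f"peak sigma_had = {peak_hadronic_cross_section_nb(n_nu):.1f} nb")
```

Each additional light neutrino species lowers the peak height by roughly 13%, a large effect compared with the precision of the measured cross sections.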


8.4.8 LEP beam energy measurement 


The measurement of the Z mass was a very careful study and involved 
measuring the beam energy accurately. How do you measure the beam 
energy? A technique called resonant depolarization involving the anom- 
alous magnetic moment of the electron (g — 2) was employed (recall the 
experiment to measure accurately the anomalous magnetic moment of 
the muon as one of the most stringent tests of QED). Let us approximate 
LEP by a circle immersed in a uniform vertical B field (see Fig. 8.11), 
which provides the bending. Then 


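$$\mathbf{F} = \frac{d\mathbf{p}}{dt} = -e\,\mathbf{v}\times\mathbf{B}, \qquad |\mathbf{p}| = eBR = \frac{e}{2\pi}BL. \qquad (8.10)$$

Also, the orbital angular frequency is $\omega_c = eB/\gamma m_e$. Electrons ‘naturally’
become polarized over about 5 hours while going around in LEP.
The polarization can be measured with backscattered light. The spin
precession angular frequency is given by

$$\omega_s = \frac{eB}{\gamma m_e}\left[1 + \gamma\left(\frac{g-2}{2}\right)\right], \qquad (8.11)$$

so we can compute the number of precessions per turn in LEP, $\nu_s$:

$$\nu_s = \frac{\omega_s - \omega_c}{\omega_c} = \gamma\,\frac{g-2}{2} = \frac{E_{\rm beam}}{m_e c^2}\,\frac{g-2}{2}. \qquad (8.12)$$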


The value of $(g - 2)/2$ is known to an accuracy of $4 \times 10^{-9}$ and the
electron mass $m_e$ to a precision of $3 \times 10^{-7}$, so if we can measure $\nu_s$, we
get the energy of the beam.
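The relation between spin tune and beam energy in eqn 8.12 can be checked numerically. The following is a minimal sketch, assuming natural units ($c = 1$) and rounded reference values for the electron mass and anomaly; the constant and function names are illustrative and do not come from the text.

```python
# Sketch of eqn 8.12: nu_s = a_e * E_beam / m_e, in natural units (c = 1).
# The two constants are rounded reference values supplied for illustration.

ELECTRON_MASS_GEV = 0.51099895e-3   # electron mass in GeV (assumed value)
ELECTRON_ANOMALY = 1.15965218e-3    # a_e = (g - 2)/2 for the electron (assumed value)


def spin_tune(beam_energy_gev: float) -> float:
    """Number of spin precessions per turn for a given beam energy."""
    return ELECTRON_ANOMALY * beam_energy_gev / ELECTRON_MASS_GEV


def beam_energy(nu_s: float) -> float:
    """Invert eqn 8.12: beam energy in GeV for a measured spin tune."""
    return nu_s * ELECTRON_MASS_GEV / ELECTRON_ANOMALY


if __name__ == "__main__":
    print(f"spin tune at 45.6 GeV: {spin_tune(45.6):.3f}")
    print(f"beam energy per unit of spin tune: {beam_energy(1.0):.5f} GeV")
```

With these inputs, one unit of spin tune corresponds to roughly 0.44 GeV of beam energy, so a precise determination of $\nu_s$ translates directly into a precise energy calibration.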

The technique for measuring $\nu_s$ (resonant depolarization) proceeds by
adding a small magnet with a field in the $x$ direction (horizontal,
transverse to the beam) that varies as $\sin \nu t$. When $\nu = \nu_s$, the contribution
will accumulate each turn and cause the beam to depolarize (as measured
in the backscattered light). The precision obtained in the beam energy
at LEP was 2 MeV (out of 45 GeV). It involved understanding various
effects, including the tidal pull of the moon (which changes $L$ slightly),
movements in the water table, and even ground currents caused when
the fast TGV trains to Paris passed by.¹⁷


8.4.9 Cross sections and forward-backward asymmetries at the Z

The couplings $c_V$ and $c_A$ for each fermion type can be extracted from
measurements of $d\sigma/d\Omega$ and the forward-backward asymmetry $A_{FB}$,
defined as

$$A_{FB} = \frac{N_F - N_B}{N_F + N_B} \tag{8.13}$$

where $N_F$ and $N_B$ are the numbers of events with $\theta < 90^\circ$ (forward) and
$\theta > 90^\circ$ (backward), respectively.¹⁸
Starting from eqn (7.40), we can show that (see Exercise 8.7)

$$\frac{d\sigma}{d\Omega} \propto A(1 + \cos^2\theta) + B\cos\theta \tag{8.14}$$

where

$$A = \left[(c_V^{(e)})^2 + (c_A^{(e)})^2\right]\left[(c_V^{(f)})^2 + (c_A^{(f)})^2\right], \qquad B = 8\,c_V^{(e)} c_A^{(e)} c_V^{(f)} c_A^{(f)}$$


Integrating eqn (8.14), we find

$$\sigma(e^+e^- \to f\bar{f}) \propto \left[(c_V^{(e)})^2 + (c_A^{(e)})^2\right]\left[(c_V^{(f)})^2 + (c_A^{(f)})^2\right] \tag{8.15}$$


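Integrating again, from eqn (8.14),

$$A_{FB} = \frac{3\,c_V^{(e)} c_A^{(e)} c_V^{(f)} c_A^{(f)}}{\left[(c_V^{(e)})^2 + (c_A^{(e)})^2\right]\left[(c_V^{(f)})^2 + (c_A^{(f)})^2\right]} = \frac{3}{4}\,\mathcal{A}_e \mathcal{A}_f \tag{8.16}$$

where

$$\mathcal{A}_e = \frac{2\,c_V^{(e)} c_A^{(e)}}{(c_V^{(e)})^2 + (c_A^{(e)})^2}, \qquad \mathcal{A}_f = \frac{2\,c_V^{(f)} c_A^{(f)}}{(c_V^{(f)})^2 + (c_A^{(f)})^2}$$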
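The step from the angular distribution to the asymmetry can be made explicit. The following short worked check assumes only the angular form $A(1 + \cos^2\theta) + B\cos\theta$ of eqn (8.14) and writes $x = \cos\theta$; the symbols $\sigma_F$ and $\sigma_B$ for the forward and backward integrals are introduced here for convenience.

```latex
\begin{align*}
\sigma_F &\propto \int_{0}^{1} \left[A(1 + x^2) + Bx\right] dx = \tfrac{4}{3}A + \tfrac{1}{2}B, \\
\sigma_B &\propto \int_{-1}^{0} \left[A(1 + x^2) + Bx\right] dx = \tfrac{4}{3}A - \tfrac{1}{2}B, \\
A_{FB} &= \frac{\sigma_F - \sigma_B}{\sigma_F + \sigma_B} = \frac{B}{\tfrac{8}{3}A} = \frac{3B}{8A}.
\end{align*}
```

The overall normalization cancels in the ratio, and inserting a coefficient $B$ equal to eight times the product of the four couplings gives three times that product divided by $A$.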

18 The angle $\theta$ is the angle between the incoming electron direction and the
outgoing lepton or quark, or between the incoming positron and outgoing
antilepton or antiquark.

What electroweak information can be obtained from these measurements?
$\sigma$ and $A_{FB}$ for a particular final state $f$ give two equations
involving $c_V^{(f)}$ and $c_A^{(f)}$, and, in principle, these can be solved to give
both $c_V^{(f)}$ and $c_A^{(f)}$ separately.¹⁹ On a plot of $c_V$ versus $c_A$, $\sigma \propto c_V^2 + c_A^2$
is a circle and $\sigma A_{FB} \propto c_V c_A$ has a hyperbolic dependence. Taken
together, these should enable the extraction of both $c_V$ and $c_A$ for the
fermion $f$ (see Exercise 8.6). This can be done for each final state $f$. In
addition, note that $\sigma$ and $A_{FB}$ also depend on the electron couplings
$c_V^{(e)}$ and $c_A^{(e)}$, which must be known before the couplings of $f$ can be
completely unravelled. These must either be taken
from other experiments or obtained using separate measurements from
$\tau^+\tau^-$ events.
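The way an asymmetry constrains the ratio $c_V/c_A$, and the twofold ambiguity that the circle-and-hyperbola picture leaves, can be illustrated numerically. This is a minimal sketch: the function names are illustrative, and the lepton couplings in the example assume the common convention $c_A = -1/2$, $c_V = -1/2 + 2\sin^2\theta_W$, which is not stated in this section.

```python
import math
from typing import Tuple


def asymmetry_parameter(c_v: float, c_a: float) -> float:
    """A_f = 2 c_V c_A / (c_V^2 + c_A^2)."""
    return 2.0 * c_v * c_a / (c_v ** 2 + c_a ** 2)


def coupling_ratios(a_f: float) -> Tuple[float, float]:
    """Both solutions r = c_V / c_A of A_f = 2 r / (1 + r^2).

    The two roots are reciprocals of each other: swapping c_V and c_A
    leaves A_f unchanged, so extra information is needed to choose.
    """
    if not 0.0 < abs(a_f) <= 1.0:
        raise ValueError("need 0 < |A_f| <= 1")
    root = math.sqrt(1.0 - a_f ** 2)
    return (1.0 - root) / a_f, (1.0 + root) / a_f


if __name__ == "__main__":
    sin2_theta_w = 0.23189             # value quoted in the text
    c_a = -0.5                         # assumed convention
    c_v = -0.5 + 2.0 * sin2_theta_w    # assumed convention
    a_lepton = asymmetry_parameter(c_v, c_a)
    print("A_lepton:", a_lepton)
    print("A_FB for leptons, 3/4 * A_lepton^2:", 0.75 * a_lepton ** 2)
    print("candidate c_V/c_A ratios:", coupling_ratios(a_lepton))
```

Only the ratio of the couplings enters the asymmetry; the cross section supplies the overall scale through $c_V^2 + c_A^2$.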


19 Measuring $\sigma$ correctly requires accurate knowledge of the luminosity, which
is described in Section 8.4.10.

Fig. 8.12 Cross section as a function of $\theta$, the angle between the incoming
electron and outgoing lepton. The two plots are the results from two different
LEP detectors; the three curves show the three main beam energies at and
near the Z pole, where LEP was run. From [17].


Fig. 8.13 $A_{FB}$ as a function of LEP beam energy [17].

We have derived the $A_{FB}$ formula assuming all the interactions are
mediated by the Z, but we also need to account for the $\gamma$ exchange
diagram and the interference between the two diagrams. Figure 8.12
shows the differential cross section as a function of $\cos\theta$ when LEP was
run on the Z pole and 2 GeV either side. The variation of $A_{FB}$ as a
function of beam energy is shown in Fig. 8.13.


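In summary, then, from the measured quantities $\sigma_{ee}$, $\sigma_{\mu\mu}$, $\sigma_{\tau\tau}$, $A_{FB}^{e}$,
$A_{FB}^{\mu}$, and $A_{FB}^{\tau}$, and some information from $\tau$ decay, we obtain $c_V^{(e)}$, $c_A^{(e)}$,
$c_V^{(\mu)}$, $c_A^{(\mu)}$, $c_V^{(\tau)}$, and $c_A^{(\tau)}$. From the data, we see the following: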




(1) Universality: $c_V^{(e)} = c_V^{(\mu)} = c_V^{(\tau)}$ and also $c_A^{(e)} = c_A^{(\mu)} = c_A^{(\tau)}$ to high
precision. All three types of lepton couple to the $Z^0$ with the same
strength.

(2) $\sin^2\theta_W = 0.23189 \pm 0.00024$ (this is the combined value from all
LEP techniques).


8.4.10 LEP luminosity measurement 


Accurate data on the LEP luminosity is essential for cross-section
measurements. Recall that the integrated luminosity $L$ is defined by the
expression $N_i = L\sigma_i$, where $N_i$ is the number of events from a given
process $i$ that occur in the detector (once detector effects like trigger
efficiency and acceptance have been corrected for) and $\sigma_i$ is the cross
section. $L$ is the same for all processes; it depends on the features of
the accelerator and how well it is working (e.g. how well the two beams
are steered into each other at a particular interaction region). Luminosity
is measured using a process $j$ with a high rate and for which we know
how to calculate the cross section. For LEP, the channel $j$ used to measure
the luminosity was low-angle electron-positron scattering (Bhabha
scattering), for which the QED single-photon-exchange diagram dominates.
There are only very small contributions to this from diagrams with
$Z^0$ or involving annihilation, and the theoretical uncertainty is below
0.1%. Bhabha scattering events are measured with small calorimeters
situated very close to the beam line on each side of the detectors, several
metres each side of the interaction point. The events are counted
when two showers with the appropriate energy are seen within a radius
region about the beam axis of typically 6 cm < $R$ < 15 cm. This cut
is made on one side only to reduce sensitivity to the movement of the
interaction point. Using the numbers of events $N_i$ and $N_j$ and the
formulae $L = N_j/\sigma_j$ and $\sigma_i = N_i/L$, we obtain the cross section in which
we are interested.²⁰
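The two formulae combine into a simple ratio of event counts. The following is a minimal sketch of that arithmetic with purely statistical (Poisson) errors; the function name and argument names are illustrative, and the corrections for acceptance, trigger efficiency, and radiative effects described in the text are assumed to have been applied to the counts already.

```python
import math
from typing import Tuple


def cross_section(n_signal: float, n_bhabha: float,
                  sigma_bhabha: float) -> Tuple[float, float]:
    """Return (sigma_i, statistical error) from sigma_i = N_i * sigma_j / N_j.

    n_signal     : corrected event count N_i for the process of interest
    n_bhabha     : corrected Bhabha count N_j in the luminosity monitor
    sigma_bhabha : calculated Bhabha cross section sigma_j in that acceptance
                   (the result is returned in the same units)
    """
    if n_signal <= 0 or n_bhabha <= 0 or sigma_bhabha <= 0:
        raise ValueError("counts and reference cross section must be positive")
    luminosity = n_bhabha / sigma_bhabha      # L = N_j / sigma_j
    sigma = n_signal / luminosity             # sigma_i = N_i / L
    # Relative Poisson errors on the two counts, added in quadrature.
    rel_err = math.sqrt(1.0 / n_signal + 1.0 / n_bhabha)
    return sigma, sigma * rel_err
```

Because the Bhabha count enters the error on the same footing as the signal count, a high-rate reference process keeps the luminosity from dominating the statistical uncertainty.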


8.4.11 Measurements at LEP2, above $\sqrt{s} = M_Z$


Superconducting cavities were added to the LEP ring from 1995 onwards
and the accelerator was run at increasing energies as more cavities were
added (with increasing beam energy, the energy lost by synchrotron
radiation each turn becomes greater and more cavities are needed to
replace it). The cross section as a function of beam energy is shown in
Fig. 8.14. The maximum energy reached was a little over $\sqrt{s} = 200$ GeV
and the last run was in 2000. The main studies at these higher energies
were searches for new particles: Higgs and SUSY being the most popular.
Precision measurements of the triple gauge coupling were also made.²¹
The higher energy also facilitated other detailed studies of the W, which
complemented those done at $p\bar{p}$ colliders.

20 In practice, luminosity measurements will be recorded as a function
of date and time after correction for acceptance and radiative corrections
(higher-order QED processes) and made available for all analyses.

21 These studies required the CMS energy to be above $2M_W$.

Fig. 8.14 Cross section versus $\sqrt{s}$ near and above the Z pole. From [17].

Fig. 8.15 One Feynman diagram for initial-state radiation (ISR).

22 Jet-finding algorithms are discussed in Section 13.3. A nice example of a
two-jet event is shown in Fig. 8.6(d).

A useful class of events at LEP2 is ‘radiative return to the $Z^0$’ or
initial-state radiation (ISR). This happens when either the beam electron
or positron emits a bremsstrahlung photon (Fig. 8.15) and loses enough
energy for the subsequent collision to have the right energy to make an
on-mass-shell $Z^0$. This is very significant, because the $Z^0$ resonance is
so large. Measuring cross sections and $A_{FB}$ above the Z peak provides
more consistency checks of the electroweak model.
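The photon energy needed for radiative return follows from simple kinematics. The sketch below assumes a single photon radiated from the initial state and massless beams, so that the reduced invariant mass squared is $s' = s - 2\sqrt{s}\,E_\gamma$; the function name and the rounded Z mass are illustrative assumptions.

```python
Z_MASS_GEV = 91.19  # rounded reference value for the Z mass (assumed here)


def isr_photon_energy(sqrt_s: float, target_mass: float = Z_MASS_GEV) -> float:
    """Photon energy (GeV) that lowers the e+e- invariant mass to target_mass.

    Solves s' = s - 2 * sqrt(s) * E_gamma for E_gamma with s' = target_mass**2.
    """
    if sqrt_s < target_mass:
        raise ValueError("collision energy is below the target mass")
    return (sqrt_s ** 2 - target_mass ** 2) / (2.0 * sqrt_s)


if __name__ == "__main__":
    for energy in (130.0, 161.0, 200.0):
        print(f"sqrt(s) = {energy:5.1f} GeV -> E_gamma = "
              f"{isr_photon_energy(energy):6.2f} GeV")
```

At the highest LEP2 energies the radiated photon therefore carries a large fraction of the beam energy, which gives these events a distinctive signature.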


8.4.12 $W^+W^-$ production


This is a good example for studying some of the important aspects of
a physics analysis either at LEP or at a hadron collider. The two main
aspects are (1) choice of final states and (2) combinatorics. $W^+W^-$
events can come in three different types depending on whether each of
the Ws decays to a pair of quarks or to a charged lepton and a neutrino.
For events with quarks in the final state, we need to run a ‘jet’ algorithm
to assign measured charged particle tracks and calorimeter energies not
associated with tracks to a particular jet.²²


• 4 jets: We get the 3-momentum components of each of 4 jets and
we know the beam energy (a total of 13 numbers), so we can do
a constrained fit of the event. The things we do not know are
the two decay angles of the Z and of each of the two Ws. We
also leave the masses of the two Ws free in the fit. It is therefore
a 5-constraint fit (13 numbers, 8 unknowns). There is also
a combinatorial problem: we do not know which pairs of jets go
together, so we try each of the three combinations in turn: 12, 34;
13, 24; 14, 23. For all three combinations, we make a scatter plot
of the mass of one W versus the mass of the other and hope to
see a peak where the real W is. Intuition tells us that the false
combinations are likely to be scattered widely about the plot, not
forming a mass peak, and a Monte Carlo simulation can be used
to provide more quantitative information about the distribution of
the false combinations.

• 2 jets, $\ell$, $\nu$: There are fewer constraints here, because the neutrino
is undetected and deprives us of 3 of the 13 numbers we had in the
4-jet case. Nevertheless, this is a 2-constraint fit and is therefore
fairly powerful. There is no combinatorial problem here, since the
detector tells us which is the lepton. This analysis is only done when
the lepton is an electron or a muon. If the lepton is a $\tau$, the $\tau$ decay
must involve at least one other neutrino, a $\tau$ neutrino. There
is still sufficient information for a 1-constraint fit.
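The combinatorial step in the 4-jet channel can be sketched in a few lines. The code below only enumerates the three pairings and computes the raw dijet masses that would populate the scatter plot; it does not attempt the constrained fit. The type alias, function names, and the `(E, px, py, pz)` layout are illustrative choices.

```python
import math
from typing import List, Sequence, Tuple

FourVector = Tuple[float, float, float, float]  # (E, px, py, pz) in GeV

# The three ways of splitting jets 1-4 into two pairs (0-based indices):
# 12,34 ; 13,24 ; 14,23.
PAIRINGS = [((0, 1), (2, 3)), ((0, 2), (1, 3)), ((0, 3), (1, 2))]


def invariant_mass(a: FourVector, b: FourVector) -> float:
    """Invariant mass of the two-jet system a + b."""
    e, px, py, pz = (a[i] + b[i] for i in range(4))
    m2 = e * e - px * px - py * py - pz * pz
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


def dijet_mass_pairs(jets: Sequence[FourVector]) -> List[Tuple[float, float]]:
    """Return (mass of first pair, mass of second pair) for each pairing."""
    if len(jets) != 4:
        raise ValueError("exactly four jets are required")
    return [
        (invariant_mass(jets[i], jets[j]), invariant_mass(jets[k], jets[l]))
        for (i, j), (k, l) in PAIRINGS
    ]
```

Filling a two-dimensional histogram with all three entries per event reproduces the procedure described above: the correct pairing clusters near the W mass on both axes, while the wrong pairings spread out.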


An example of a W mass fit from the $\mu\nu_\mu q\bar{q}$ channel [5] is shown in
Fig. 8.16. This technique is a good way to measure the mass of the W.
Each individual event gives two independent measurements of the mass
and all we need do is take an average to get the mass. If we want to be
more sophisticated, we can estimate the error on each mass measurement
(from the errors on the track and calorimeter measurements) and do
a weighted average. The spread of the individual $M_W$ measurements
gives us the opportunity to measure the width of the W (the spread
is determined by the natural width and the experimental resolution).
The result is $\Gamma_W = 2.48 \pm 0.41$ GeV. The prediction from electroweak
theory is 2.077 GeV, which is consistent. This is not as precise, however,
as the measurements from the Tevatron (the CDF and D0 experiments)
discussed in Section 8.5.3.
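The weighted average mentioned above is the standard inverse-variance combination. A minimal sketch, assuming uncorrelated Gaussian errors on the individual mass measurements (the function name is illustrative):

```python
import math
from typing import Sequence, Tuple


def weighted_mean(values: Sequence[float],
                  errors: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance weighted mean and its uncertainty.

    Each measurement is weighted by 1 / error**2, so precisely
    reconstructed events count for more than poorly measured ones.
    """
    if len(values) != len(errors) or len(values) == 0:
        raise ValueError("need equally many values and errors, at least one")
    if any(err <= 0 for err in errors):
        raise ValueError("errors must be positive")
    weights = [1.0 / (err * err) for err in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)
```

With equal errors this reduces to the simple average, and the uncertainty falls as one over the square root of the number of measurements.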


8.4.13 $\sigma(e^+e^- \to W^+W^-)$


The three diagrams contributing to the $e^+e^- \to W^+W^-$ cross section
are shown in Fig. 8.17. All three are needed to give the cancellation
required to avoid a divergent theoretical result for this cross section as
the energy increases. It is therefore very satisfying to see that the measured
values agree with the theory when all three diagrams are present
but completely disagree with a calculation neglecting the triple-gauge-boson
coupling. This provides conclusive evidence for the existence of
the triple-gauge-boson couplings $\gamma W^+W^-$ and $ZWW$, with the rates
agreeing with those predicted by the unified electroweak theory.

Also, the threshold energy for $W^+W^-$ pair production gives a separate
measurement of the mass of the W. The shape of $\sigma(e^+e^- \to W^+W^-)$ as
a function of $\sqrt{s}$ just above threshold is shown in Fig. 8.18. The result
for the mass of the W from LEP (combining the cross-section threshold
measurement and the individual event reconstruction method described
above) is $M_W = 80.39 \pm 0.09$ GeV. The Tevatron experiments provide a
better measurement, as will be discussed in Section 8.5.2.

Fig. 8.16 W mass distribution from the 2-constraint fit for $W \to \mu\nu_\mu$,
$W \to q\bar{q}$. Similar distributions are obtained for the $q\bar{q}q\bar{q}$, $e\nu q\bar{q}$, and
$\tau\nu q\bar{q}$ channels. From [5].

Fig. 8.17 Three lowest-order Feynman diagrams for W-pair production:
(a) t-channel $\nu_e$ exchange; (b) s-channel $\gamma$ exchange; (c) s-channel Z exchange.

Fig. 8.18 The $\sqrt{s}$ energy dependence of $\sigma(e^+e^- \to W^+W^-)$ near threshold,
from which the W mass may be determined. The lower curve, which
follows the data, shows the predicted cross section including all three of the
diagrams shown in Fig. 8.17, and the other curves show the prediction if
some of these diagrams are omitted. From [16].

23 The W was discovered before the Z because the product of cross section
and branching ratio is an order of magnitude bigger than that for the Z.

24 The fact that the W and Z signatures were so clean was a major turning point
in the subject. Previous prejudice was that hadron colliders were too ‘dirty’.
This change of attitude opened the way to the construction of the LHC.


8.5 W and Z physics at hadron colliders


8.5.1 W and Z discovery


The W and Z were discovered in 1983 at the CERN $p\bar{p}$ collider. The
accelerator physics required to produce sufficiently intense $\bar{p}$ beams is
reviewed in Chapter 3. The largest branching ratios for W and Z are to
hadronic final states, but these are very difficult to study in a hadron
collider because of the very large cross section for QCD jets (see Chapter 9).
Therefore, the experiments focused on the relatively clean leptonic
decay modes: $W \to e\nu_e$, $W \to \mu\nu_\mu$, and $Z \to e^+e^-$, $Z \to \mu^+\mu^-$.²³
The key detector feature that enabled the W discovery was the use of
‘hermetic’ detectors, which allowed the neutrino transverse momentum
to be determined from the measured missing transverse energy ($E_T^{\text{miss}}$,
see Chapter 4). As the mass of the W was predicted to be very large
(~80 GeV), the signature was a high-transverse-momentum lepton (electron
or muon) and a high value for $E_T^{\text{miss}}$. While there are backgrounds
from QCD that can produce ‘fake’ electrons or muons in the detector,
they would not usually have such a large energy, nor would a QCD event
produce such a large value of $E_T^{\text{miss}}$. The signature for a Z is an $e^+e^-$ or
$\mu^+\mu^-$ pair with invariant mass consistent with that of the Z, resulting
in a narrow resonance on top of a low background. The measured W and
Z masses were in very good agreement with the predictions from electroweak
theory.²⁴ Much higher-precision measurements of the Z mass
were made at LEP, as discussed in Section 8.4.


Test of V-A


We will need to assume that the W decays are via a V-A coupling in
order to be able to use their distinctive signature quantitatively. The
most direct and in some sense simplest test of the V-A theory of weak
decays is provided by measuring the angular distribution of the charged
leptons resulting from W decay. We can calculate the CMS angular
distribution, as will be discussed shortly. As the Ws are produced with
finite longitudinal momentum, we need to boost the measured event to
the $q\bar{q}$ CMS. We do not directly determine the longitudinal momentum
of the neutrino, but it can be determined up to a quadratic ambiguity by
using the W mass constraint. The measured distribution [15] is shown
in Fig. 8.19.

Fig. 8.19 Measured angular distribution, as a function of $Q\cos\theta^*$, in
$W \to e\nu_e$ decays [15].


8.5.2 W mass determination at the Tevatron 


The current best measurement of the W mass comes from the Tevatron
$p\bar{p}$ collider, which had a CMS energy of 1.96 TeV. The mass of the
Z was measured very precisely at LEP. Combining precision measurements
of the W and Z bosons is interesting because the result provides
a window on possible higher-mass particles via radiative corrections (see
Section 7.4.3).

The decay modes $W \to \ell\nu$ with $\ell = e$ or $\ell = \mu$ are used for the W
mass determination, since they have very clean signals. The neutrino is
not directly detected; however, the transverse component of the neutrino
momentum can be determined from the missing transverse momentum in
the event. Too much energy is lost in the beam pipes for the longitudinal
component to be determined. Also, the quarks carry unknown fractions
$x$ and $\bar{x}$ of the proton and antiproton momenta, and hence the invariant
mass of the W cannot be computed from the final-state particles
in an individual event. However, the mass of the W can be determined
from the data on a statistical basis. The W bosons are produced via
the parity-violating V-A interaction. We will assume that the quarks
(antiquarks) come from the $p$ ($\bar{p}$) because the typical values of the proton
momentum fraction $x$ (antiproton momentum fraction $\bar{x}$) are quite
large. All the quarks and leptons are ultrarelativistic, so the V-A interaction,
which couples left-handed particles (right-handed antiparticles),
will result in negative-helicity particles (positive-helicity antiparticles).
The spin structure in W production and decay is shown in Fig. 8.20.

Fig. 8.20 Spin structure in $q\bar{q} \to W^- \to e^-\bar{\nu}_e$ events.

25 The peak arises from the Jacobian of the change of variables and is often
called the ‘Jacobian peak’.

26 At collider energies, the masses of the leptons ($e$ and $\mu$) are negligible
compared with their momenta, and therefore we can use the massless
approximation $p = E$.

The $W^-$ has a spin along the proton beam direction of $J_z = -1$
and the $\ell^-\bar{\nu}_\ell$ system has $J_{z'} = -1$ along an axis pointing in the direction
of the $\ell^-$. The angular distribution in the W CMS is then given simply
by the rotation matrix (see Chapter 2):


$$\frac{dN}{d\cos\theta^*} = N_0\left[d^1_{-1,-1}(\cos\theta^*)\right]^2 = \frac{N_0}{4}(1 + \cos\theta^*)^2 \tag{8.17}$$


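where $N_0$ is a normalization constant. From the chain rule, we can
change variable to the component of the momentum of the electron
perpendicular to the beam, $p_T^e$:

$$\frac{dN}{dp_T^e} = \frac{dN}{d\cos\theta^*}\,\frac{d\cos\theta^*}{dp_T^e} \tag{8.18}$$

with

$$\cos\theta^* = \sqrt{1 - \sin^2\theta^*} = \sqrt{1 - \left(\frac{2p_T^e}{M_W}\right)^2} \tag{8.19}$$

Differentiating gives

$$\frac{d\cos\theta^*}{dp_T^e} = -\frac{4p_T^e/M_W^2}{\sqrt{1 - \left(2p_T^e/M_W\right)^2}} \tag{8.20}$$

Substituting from eqn 8.20 into eqn 8.18 gives²⁵

$$\left|\frac{dN}{dp_T^e}\right| = N_0\,\frac{p_T^e}{M_W^2}\,\frac{\left[1 + \sqrt{1 - \left(2p_T^e/M_W\right)^2}\right]^2}{\sqrt{1 - \left(2p_T^e/M_W\right)^2}} \tag{8.21}$$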


From this expression, it can be seen that the $p_T$ spectrum will be peaked
towards its upper endpoint at $M_W/2$. In principle, the W mass can be
determined by fitting the measured charged-lepton $p_T$ distribution to
eqn 8.21.²⁶ However, the shape of the distribution is distorted by the
distribution of the transverse momentum of the W, $p_T^W$. Therefore, it
is convenient to define the transverse mass $M_T$ in a similar way to the
invariant mass but only considering transverse components:

$$M_T^2 = \left(p_T^\ell + p_T^\nu\right)^2 - \left|\mathbf{p}_T^\ell + \mathbf{p}_T^\nu\right|^2 \tag{8.22}$$

This can also be expressed as (see Exercise 8.5)

$$M_T^2 = 2p_T^\ell p_T^\nu\left(1 - \cos\Delta\phi\right) \tag{8.23}$$

where $\Delta\phi$ is the angle between $\mathbf{p}_T^\ell$ and $\mathbf{p}_T^\nu$, the transverse momenta of
the lepton and neutrino, respectively.


Experimental aspects of measuring $M_W$ from the $M_T$ distribution


For W bosons produced with no transverse momentum, $M_T = 2p_T^\ell$ and
to lowest order the effect of non-zero values of $p_T^W$ will not change the
value of $M_T$; hence there should be an endpoint at $M_W$. The distribution
is smeared by the finite width of the W and by experimental resolution.
However, if these effects are taken into account, the data can be used in
a fit to determine $M_W$. There are also several systematic uncertainties
that must be understood before a precision measurement of $M_W$ can be
made.
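The Jacobian peak and the transverse-mass endpoint can be seen in a toy simulation. The sketch below is deliberately idealized: the W is at rest with zero width, the detector is perfect, and $\cos\theta^*$ is drawn from the $(1 + \cos\theta^*)^2$ distribution by inverting its cumulative distribution. All function names are illustrative, and the W mass is the world-average value quoted in the text.

```python
import math
import random
from typing import List

W_MASS_GEV = 80.385  # world-average value quoted in the text (80385 MeV)


def transverse_mass(pt_lepton: float, pt_neutrino: float,
                    delta_phi: float) -> float:
    """Transverse mass from M_T^2 = 2 pT_l pT_nu (1 - cos(delta_phi))."""
    return math.sqrt(2.0 * pt_lepton * pt_neutrino * (1.0 - math.cos(delta_phi)))


def sample_lepton_pt(rng: random.Random, m_w: float = W_MASS_GEV) -> float:
    """Draw cos(theta*) from (1 + cos)^2 and return pT = (M_W / 2) sin(theta*)."""
    # The normalized CDF is (1 + c)^3 / 8, so c = 2 u^(1/3) - 1 for uniform u.
    cos_theta = 2.0 * rng.random() ** (1.0 / 3.0) - 1.0
    return 0.5 * m_w * math.sqrt(max(0.0, 1.0 - cos_theta * cos_theta))


def pt_histogram(n_events: int = 100_000, n_bins: int = 8,
                 seed: int = 1) -> List[int]:
    """Histogram of lepton pT between 0 and M_W / 2 for the toy sample."""
    rng = random.Random(seed)
    width = 0.5 * W_MASS_GEV / n_bins
    counts = [0] * n_bins
    for _ in range(n_events):
        pt = sample_lepton_pt(rng)
        counts[min(int(pt / width), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    print(pt_histogram())
```

In this idealized case the lepton and neutrino are back to back with equal transverse momenta, so `transverse_mass(pt, pt, math.pi)` is simply twice the lepton transverse momentum, and the bin nearest the endpoint is the most populated.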

One of the most important of these is the energy scale for electrons
and muons; this can be constrained by fitting the $\ell^+\ell^-$ invariant mass
(where $\ell$ is an $e$ or a $\mu$) to the peaks from $Z$, $\Upsilon$, and $J/\psi$ decays. Using
the known masses of these resonances, the energy scale can be calibrated,
allowing for any nonlinearity. As $M_Z > M_W$, it is essential to fix the
nonlinearity as well as the overall energy scale of the detector.

Another key systematic uncertainty is the measurement of $p_T^\nu$ by the
method of missing transverse momentum. This can be constrained by
studying the apparent missing transverse energy in $Z \to \ell^+\ell^-$ decays
and looking at the agreement between the values of $p_T$ inferred from
the accurately measured $\ell^+\ell^-$ system and the hadronic recoil. The axes
perpendicular and parallel to the bisector of the $\ell^+\ell^-$ system are defined
as shown in Fig. 8.21. For a perfect detector, the transverse momentum
of the $\ell^+\ell^-$ system would be balanced by the hadronic recoil $\mathbf{u}$. The $\ell^+\ell^-$
system is measured with far better precision than that of the hadronic
recoil. Therefore, the distribution of $\mathbf{p}_T^{\ell\ell} + \mathbf{u}$ can be used to determine
the hadronic response. An example of such a plot from the CDF experiment
[1] is shown in Fig. 8.22, which shows the mean value of $p_\eta^{\ell\ell} + u_\eta$
as a function of $p_T^{\ell\ell}$. The quantity is projected onto the $\eta$ axis because
the experimental error in the value of $p_\eta^{\ell\ell}$ is mainly due to errors in the
measurements of angles, rather than energy. Measurements of angles
from the tracking detector are very precise, so the measurement errors
in this quantity are greatly reduced. The value of this quantity would
be 0 for an ideal detector and the fact that the mean value is positive is
due to energy loss outside the acceptance of the detector and in cracks
between calorimeter cells.

Fig. 8.21 Axes parallel and perpendicular to the bisector of the $\ell^+\ell^-$ system.

Fig. 8.22 Calibration of the hadronic response by measuring the mean value
of $p_\eta^{\ell\ell} + u_\eta$ versus $p_T^{\ell\ell}$ as measured by CDF in $Z \to e^+e^-$ decays [1].


Fig. 8.23 W mass fit at CDF in the $W \to e\nu_e$ channel [3].

The $M_T$ distribution for $W \to \mu\nu_\mu$ shows a very clean Jacobian peak
(see Fig. 8.23). From the fits to the $M_T$ spectra from $W \to e\nu_e$ and
$W \to \mu\nu_\mu$, the W mass is determined to be 80387 ± 19 MeV, which
is the most precise measurement to date [3]. The current (2014) world
average value [115] is 80385 ± 15 MeV.


8.5.3 Width of the W 


The total width of the W was measured at CDF and D0 using two
methods to compare it with the value predicted by the electroweak theory
of $\Gamma_W = 2.077$ GeV. The direct technique is to extract it from
the same transverse mass distribution as described above. The distribution
will be modified depending on the W width. The value obtained
is $\Gamma_W = 2.11 \pm 0.32$ GeV.


The indirect method of obtaining the total W width involves using
some information from LEP. The width is obtained by measuring the
production cross-section ratio²⁷

$$\frac{\sigma(p\bar{p} \to W \to \ell\nu)}{\sigma(p\bar{p} \to Z \to \ell\ell)} = \frac{\sigma(p\bar{p} \to W)}{\sigma(p\bar{p} \to Z)} \times \frac{\Gamma(W \to \ell\nu)}{\mathrm{BR}(Z \to \ell\ell)} \times \frac{1}{\Gamma_W} \tag{8.24}$$

The other numbers in the formula are obtained as follows: the ratio
$\sigma(p\bar{p} \to W)/\sigma(p\bar{p} \to Z)$ is predicted by modelling; $\Gamma(W \to \ell\nu)$
is predicted from the electroweak theory; $\mathrm{BR}(Z \to \ell\ell)$ was measured
at LEP.

The result obtained is $\Gamma_W = 2.062 \pm 0.059$ GeV. The importance of this
measurement is that the total W width is sensitive to any particles into 
which the W might decay, including some that we might not otherwise 
have discovered yet. This indirect method produces a much more precise 
check than either the direct method or the measurements from LEP. 
Everything is consistent with the prediction from the electroweak theory. 
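The indirect method amounts to solving the cross-section ratio for the total width. The sketch below shows that rearrangement; the function and argument names are illustrative, and the numerical inputs in the example are round placeholder values chosen only to exercise the formula, not the inputs used by CDF or D0.

```python
def w_width_from_ratio(measured_ratio: float, sigma_w_over_sigma_z: float,
                       gamma_w_lnu: float, br_z_ll: float) -> float:
    """Solve the ratio formula (eqn 8.24) for the total W width.

    measured_ratio       : sigma(W -> l nu) / sigma(Z -> l l) from data
    sigma_w_over_sigma_z : predicted production cross-section ratio
    gamma_w_lnu          : predicted partial width Gamma(W -> l nu)
    br_z_ll              : measured branching ratio BR(Z -> l l)
    The result has the same units as gamma_w_lnu.
    """
    if measured_ratio <= 0 or br_z_ll <= 0:
        raise ValueError("ratio and branching ratio must be positive")
    return sigma_w_over_sigma_z * gamma_w_lnu / (br_z_ll * measured_ratio)


def expected_ratio(gamma_w: float, sigma_w_over_sigma_z: float,
                   gamma_w_lnu: float, br_z_ll: float) -> float:
    """Right-hand side of eqn 8.24 for an assumed total width."""
    return sigma_w_over_sigma_z * gamma_w_lnu / (br_z_ll * gamma_w)


if __name__ == "__main__":
    # Placeholder inputs for illustration only (not measured values).
    inputs = dict(sigma_w_over_sigma_z=3.4, gamma_w_lnu=0.226, br_z_ll=0.0337)
    ratio = expected_ratio(2.077, **inputs)   # predicted width from the text
    print("ratio for the predicted width:", ratio)
    print("width recovered from that ratio:", w_width_from_ratio(ratio, **inputs))
```

Because the total width enters only through the leptonic branching ratio of the W, any additional decay channel would lower the measured ratio, which is what gives the method its sensitivity.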


8.6 Top-quark physics 


The top quark was expected ever since the b quark was discovered in 
1977. From the measurements at LEP of $Z \to b\bar{b}$, the b quark was
confirmed to be a member of a weak isodoublet. While these results gave 
strong indications that the top quark must exist, it was the combination 
of the precise electroweak data with the radiative corrections that gave a 
prediction for the top-quark mass of ~170 GeV (see Chapter 7). We will 
review the discovery of the top quark in Section 8.6.1 and then consider 
how to make precision measurements of its mass in Section 8.6.2. 


8.6.1 Top-quark discovery 


Given the high mass, the only accelerator at the time capable of pro- 
ducing the top quark was the Tevatron. The largest cross sections for 
producing top quarks at the Tevatron are pair production from either 
$gg$ or $q\bar{q}$ initial states (see Fig. 8.24).

The top quark decays very rapidly, with a branching ratio of essentially 
100% into a W and a b quark. Since the mass of the top is much larger 
than that of the W, the W from top decay is on mass shell and it decays 
through all its decay modes with the known branching ratios. Therefore, 
it is easy to work out the fraction of events with different final-state 
topologies. All events will have two b-jets. The decays of the two Ws 
leading to final states that can be identified above the backgrounds are 
listed in Table 8.2. 

27 Measuring a ratio of cross sections is easier than measuring a correctly
normalized single cross section, since many systematic effects will cancel out.

28 Again a ratio is easier to predict.

Fig. 8.24 Lowest-order Feynman diagrams for $t\bar{t}$ production.

Final state                    Fraction
$e\mu$                         1/81
$e^+e^-$ or $\mu^+\mu^-$       2/81
$e$+jets or $\mu$+jets         12/81
All jets                       36/81

Table 8.2 Some $t\bar{t}$ final-state topologies.

The main Standard Model background is W + jets from higher-order
QCD corrections to W production. In general, this type of background
event will not contain b quarks. Therefore, identifying jets containing
b quarks is a critical technique for separating the top signal from
backgrounds. There are two strategies for identifying b quarks (called
‘b-tagging’):

• Use semileptonic decays of the b quark to search for leptons (in
practice only $e$ or $\mu$). Because of the relatively high mass of the b
quark, these leptons will tend to have high transverse momentum
with respect to the jet axis.

• Use the relatively long lifetime of the b quark. The b quark fragments
into B mesons or baryons before it decays. The lifetime is of
the order of a picosecond, so, allowing for time dilation, the decay
vertices of B hadrons can be displaced by distances of the order of
millimetres.
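The millimetre scale quoted for lifetime tagging follows from the mean decay length $L = \beta\gamma c\tau$ with $\beta\gamma = p/m$. A minimal sketch; the function name is illustrative, and the B-hadron mass, momentum, and lifetime in the example are round values assumed for illustration rather than numbers taken from the text.

```python
SPEED_OF_LIGHT_MM_PER_PS = 0.299792458  # c expressed in mm per picosecond


def mean_decay_length_mm(momentum_gev: float, mass_gev: float,
                         lifetime_ps: float) -> float:
    """Mean flight distance L = (p / m) * c * tau, using beta * gamma = p / m."""
    if mass_gev <= 0 or lifetime_ps < 0 or momentum_gev < 0:
        raise ValueError("mass must be positive; momentum and lifetime non-negative")
    return (momentum_gev / mass_gev) * SPEED_OF_LIGHT_MM_PER_PS * lifetime_ps


if __name__ == "__main__":
    # Assumed illustrative inputs: a 5.3 GeV B hadron with a 1.5 ps lifetime.
    for p in (10.0, 30.0, 50.0):
        length = mean_decay_length_mm(p, 5.3, 1.5)
        print(f"p = {p:4.1f} GeV -> mean decay length = {length:.2f} mm")
```

Displacements of this size are small compared with the beam-pipe radius, which is why the tracking layers closest to the interaction point determine the tagging performance.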


The first technique is limited by the semileptonic branching ratio, so life- 
time tagging is more powerful. This requires very high-precision tracking 
close to the interaction point. As B hadrons will typically decay inside 
the beam pipe, very high-precision detectors of very low mass to minim- 
ize multiple scattering are required. The only practical technique that 
can satisfy both requirements is based on the use of silicon detectors (see 
Chapter 4). Starting from the interaction point, the beam pipe within the 
vertex detector is made from beryllium (Z = 4), since this has a very long 
radiation length (which minimizes distortion of charged-particle tracks 
due to multiple scattering). The active detector components consist of 
layers of silicon strip detectors around the beam pipe. The first layer is 
mounted on the beam pipe to minimize the distance to the beam line. 
This reduces extrapolation errors and the effect of multiple scattering 
in the beam pipe. The transverse impact parameter is defined as the 
closest approach of an extrapolated track to the primary vertex location 
in the plane transverse to the beam line. Short-lived hadrons from light- 
quark jets produce a Gaussian distribution and the long-lived B hadrons 
generate an exponential tail at high values of the impact parameter. 

The power of b-tagging to identify the top-quark signal is illustrated
in Fig. 8.25 [7], which shows the number of events with Ws and various
numbers of high-transverse-momentum jets before and after b-tagging.
The signal events from $t\bar{t}$ production should ideally contain 4 jets, but this can
be distorted by instrumental effects. The number of W + jet events will decrease
rapidly with increasing number of jets, because each additional jet has
a penalty of the order of $\alpha_s$. A very clear signal is visible for 4 jets after
b-tagging, as shown in Fig. 8.25.


8.6.2 Top-quark mass measurement 


Once a clean signal for $t\bar{t}$ has been established, these events can be used
to fit the top-quark mass $m_t$ by reconstructing the decay products of
the $t$ and $\bar{t}$. This procedure is difficult, since there is no way of uniquely
identifying which jet came from the $t$ and which from the $\bar{t}$. Many possible
combinations have to be tried in turn and some algorithm used to
select the most likely combination. In the ‘all-jets’ channel, all the decay
products of the $t$ and $\bar{t}$ are measured directly, but this channel suffers
from a large background.

The semileptonic channels are cleaner, but in a hadron collider only
the momentum of the neutrino transverse to the beam direction can
be reconstructed. The unknown longitudinal momentum of the neutrino
can be determined by exploiting the constraint from the decay $W \to \ell\nu$:

$$(M_W)^2 = (E_\nu + E_\ell)^2 - (\mathbf{p}_\nu + \mathbf{p}_\ell)^2 \tag{8.25}$$

The neutrino momentum is split into a transverse component $\mathbf{p}_{\nu T}$, which
can be inferred from the measured missing transverse energy, and an
unknown longitudinal component $p_{\nu L}$. This gives

$$(M_W)^2 = \left(\sqrt{p_{\nu T}^2 + p_{\nu L}^2} + E_\ell\right)^2 - \left(\mathbf{p}_{\nu T} + \mathbf{p}_{\ell T}\right)^2 - \left(p_{\nu L} + p_{\ell L}\right)^2 \tag{8.26}$$

Using the known mass of the W, the value of $p_{\nu L}$ can be determined
up to a twofold ambiguity by solving the quadratic equation 8.26. An
example of a fit to the top mass is shown in Fig. 8.26 [2].
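The quadratic for the neutrino longitudinal momentum can be solved in closed form. The sketch below treats the charged lepton as massless and introduces the shorthand $a = M_W^2/2 + \mathbf{p}_{\ell T} \cdot \mathbf{p}_{\nu T}$; the function name and argument layout are illustrative, and returning an empty list when the discriminant is negative is one possible choice (analyses often handle that case differently).

```python
import math
from typing import List, Tuple

W_MASS_GEV = 80.385  # world-average value quoted in the text (80385 MeV)


def neutrino_pz_solutions(lepton: Tuple[float, float, float],
                          nu_pt: Tuple[float, float],
                          m_w: float = W_MASS_GEV) -> List[float]:
    """Solve the W mass constraint for the neutrino longitudinal momentum.

    lepton : (px, py, pz) of the charged lepton, treated as massless
    nu_pt  : (px, py) of the neutrino from the missing transverse momentum
    Returns the real solutions for p_nuL in increasing order (possibly none).
    """
    lx, ly, lz = lepton
    nx, ny = nu_pt
    pt_l2 = lx * lx + ly * ly
    pt_nu2 = nx * nx + ny * ny
    if pt_l2 == 0.0:
        raise ValueError("lepton must have non-zero transverse momentum")
    e_l = math.sqrt(pt_l2 + lz * lz)
    a = 0.5 * m_w * m_w + lx * nx + ly * ny
    # Squaring E_l * E_nu = a + pz_l * pz_nu gives
    # pT_l^2 pz^2 - 2 a pz_l pz + (E_l^2 pT_nu^2 - a^2) = 0.
    disc = a * a - pt_l2 * pt_nu2
    if disc < 0.0:
        return []  # no real solution for these measured inputs
    root = e_l * math.sqrt(disc)
    return sorted({(a * lz - root) / pt_l2, (a * lz + root) / pt_l2})
```

The two roots are the twofold ambiguity mentioned above; a top-mass fit would typically try both and keep the one that gives the better overall event hypothesis.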

The combined result of the top-quark mass measurements from the
Tevatron CDF and D0 experiments is $m_t = 173.20 \pm 0.51$ (statistical) $\pm$
0.71 (systematic) GeV [131]. The top-quark measurements at LHC will
be reviewed in Chapter 13.


8.6.3 Top-quark production cross sections 


Fig. 8.25 Distribution of the number of jets before and after b-tagging as
measured by CDF. The circles (triangles) are the data before (after)
b-tagging and the shaded boxes represent the background estimates after
b-tagging. From [7].

Fig. 8.26 Top-quark mass fits from the CDF experiment. (a) Reconstructed dijet mass, which shows the expected peak at
$M_W$. (b) Distribution of reconstructed top-quark masses. From [2].

Fig. 8.27 The mass of the W versus the mass of the top quark, with results
from direct measurements and indirect determinations from Standard Model
fits. The region $m_H < 114$ GeV is excluded by direct searches at LEP.
The small ellipse shows the Standard Model fit to all the precision data,
including the direct measurements of the W and top-quark masses, whereas
the larger dashed ellipse includes only the precision data collected near the
Z pole. The shaded diagonal bands show the Standard Model expectations
for different ranges of the Higgs boson mass. From [97].

As well as measuring the mass of the top quark, another key measurement
is the top-quark production cross section. The Tevatron
collider measurements [115] for the total top production cross section


($\sigma_{\text{tot}}(p\bar{p} \to t\bar{t})$ in $p\bar{p}$ collisions at $\sqrt{s} = 1.96$ TeV) are $\sigma^{\text{D0}} = 7.56^{+0.63}_{-0.56}$
(statistical + systematic) pb and $\sigma^{\text{CDF}} = 7.50 \pm 0.48$ (statistical +
systematic) pb. The methodology for calculating cross sections for Standard
Model processes will be described in Chapter 9.


8.7 Summary 


All electroweak data are consistent with the Standard Model at a precision
sufficient to be sensitive to radiative corrections. This enables us
to place limits on masses and couplings of new particles. For example,
using the measured masses of the top quark and the W boson within
the context of the Standard Model, we can predict a value for the mass
of the Higgs boson. The results are shown in Fig. 8.27. Combining this
analysis with the results of the direct Higgs search at LEP (see Chapter 12),
the 95% confidence level allowed range [97] for the Higgs mass
is 114 GeV < $m_H$ < 149 GeV. The discovery of a Higgs boson in this
mass range is discussed in Chapter 12.


Chapter summary 


• Experiments confirmed the V−A structure for the charged-current weak interactions. 

• Neutral currents were detected, which was the first success for the unified electroweak theory. 

• The W and Z bosons were discovered at the expected masses. 

• The electroweak theory was tested with very high precision at LEP running on the Z boson. 

• Many precision measurements allowed the Standard Model to be probed at the level of the radiative corrections, which allowed the successful prediction of the top-quark mass. 

• The measured top-quark mass combined with the other precision data allowed narrowing of the allowed range for a Standard Model Higgs boson. 

Further reading 


• Cahn, R. N. and Goldhaber, G. (2009). The Experimental Foundations of Particle Physics (2nd edn). Cambridge University Press. Chapters 6 and 12–15 give a summary and reprints of key papers on weak interactions and electroweak unification. 

• Particle Data Group (2014). Review of Particle Physics. Chin. Phys. C, 38, 090001. In the section ‘Standard Model and related topics’, the review article ‘Electroweak model and constraints on new physics’ gives a full discussion of the Standard Model fits to all the precision electroweak data. 

• Behnke, O. et al. (Eds.) (2013). Data Analysis in High Energy Physics. Wiley. A good book on advanced data analysis techniques. 


Exercises 


(8.1) Consider the reaction used in the determination of the neutrino helicity: $^{152}\mathrm{Eu}(J=0) + e^- \to {}^{152}\mathrm{Sm}^*(J=1) + \nu_e$, followed by $^{152}\mathrm{Sm}^*(J=1) \to {}^{152}\mathrm{Sm}(J=0) + \gamma$. The energy released in the electron capture of the $^{152}$Eu is 840 keV. 

(a) Show that the $\gamma$ emitted in $^{152}\mathrm{Sm}^*$ decays in which the nucleus is at rest have too low energy to undergo the inverse reaction $\gamma + {}^{152}\mathrm{Sm}(J=0) \to {}^{152}\mathrm{Sm}^*(J=1)$. The width of the resonance at 960 keV in $^{152}\mathrm{Sm}^*$ is about 3 eV. 

(b) Show that if the $\gamma$ is emitted in the forward direction with respect to the recoiling $\mathrm{Sm}^*$, then it will have sufficient energy to undergo $\gamma + {}^{152}\mathrm{Sm}(J=0) \to {}^{152}\mathrm{Sm}^*(J=1)$. 

(c) Consider the four combinations of the spins of the electron and the neutrino to show that for the allowed cases, the helicity of the photon is equal to that of the neutrino. 
Hint: The spin of the photon projected on its direction of propagation can only be +1 or −1. 

(8.2) Justify eqn 8.1. 
Hint: first consider the allowed couplings of L and R leptons and use the Standard Model relation for the L and R coupling constants in terms of the weak mixing angle (see eqn 7.40). 

(8.3) Justify the condition for the electron angles and energies given by eqn 8.2. 

(8.4) If the neutrinos cannot be directly detected, how can one detect $Z \to \nu\bar\nu$ events in $e^+e^-$ collisions? Draw an appropriate Feynman diagram and discuss the resulting detector requirements. 
Hint: Consider the effects of initial-state radiation (ISR). 

(8.5) Show that eqn 8.23 can be derived from eqn 8.22. 

(8.6) Calculate the forward–backward asymmetry $A_{FB}$ and the total cross-section for the reactions 
(a) $e^+e^- \to \mu^+\mu^-$ 
(b) $e^+e^- \to b\bar b$ 

(8.7) Determine the angular distribution given eqn 8.14. 

(8.8) The cross section for inverse beta decay is given by 

$$\sigma(\bar\nu_e + p \to e^+ + n) = \frac{4G_F^2 p^2}{\pi}$$

Derive this cross section from Fermi’s Golden Rule, ignoring spin and assuming that the matrix element $M_{if} = 2G_F$. In a fission reactor, the energy released per fission is ~200 MeV and there are 6 antineutrinos released. The average energy of the antineutrinos is approximately 2 MeV. What is the minimal power of a nuclear fission reactor to have three interactions per hour in a 200 kg water tank at 10 m distance? 

(8.9) At a given neutrino energy, the ratio of antineutrino to neutrino cross-sections for scattering on electrons is 

$$\sigma(\bar\nu_\mu)/\sigma(\nu_\mu) = 0.85$$

Deduce a value for $\sin^2\theta_W$. 

(8.10) In an experimental run at LEP at the $Z^0$ peak energy, the integrated luminosity was 23.955 pb$^{-1}$. After corrections for background and detector efficiencies, a total of 993 797 hadronic events and 47 838 $\mu^+\mu^-$ events were obtained. Stating your assumptions, calculate the number of neutrino flavours and comment on your result. 

(8.11) The cross section for $e^+e^- \to W^+W^-$ just above threshold is given by 

$$\sigma_T = \frac{2G_F^2 M_W^4}{\pi s}\sqrt{1 - \frac{4M_W^2}{s}}$$

where $G_F$ is the Fermi coupling constant, $M_W$ the W mass, and $s$ the CMS energy squared, all in natural units. Express $\sigma_T$ in terms of $s$ and $\beta$. With $e^+$ and $e^-$ beams of energy 80.65 GeV, a cross section for all-hadronic decay modes of 1.7 pb is obtained. Hence calculate a value for the W mass. 


Dynamic quarks 


In this chapter, we consider the structure of hadrons and discuss the 
most direct dynamical evidence that hadrons are made from quarks. This 
evidence comes from the scattering of high-energy leptons off protons 
and neutrons. The relatively high cross section for reactions at large 
transverse momentum, so-called deep inelastic scattering (DIS), leads to 
the conclusion that these hadrons are made up of point-like constituents. 
To observe the structure within hadrons, a resolution smaller than the 
size of the hadron is required; hence the wavelength of the probe should 
satisfy $\lambda \ll R$ (where $R$ is the radius of a hadron). From the de Broglie 
relationship¹ $\lambda = h/p$, it follows that high-energy experiments are required 
to study this structure. The uncertainty relation in a form more 
suitable for high-energy physics, $\Delta x\,\Delta p\,c \sim \hbar c$, implies that we require 
an energy transfer of the order of at least 20 GeV to achieve a resolution 
of $10^{-2}$ fm.² 
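The 20 GeV estimate can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name and the constant name are our own, and it assumes the standard conversion constant $\hbar c \approx 197.327$ MeV fm.

```python
# Minimal sketch: momentum scale needed to resolve a distance dx,
# using dp*c ~ hbar*c / dx.  Names here are illustrative, not from the text.
HBAR_C_MEV_FM = 197.327  # hbar*c in MeV fm (standard conversion constant)


def momentum_scale_gev(resolution_fm: float) -> float:
    """Return hbar*c / dx in GeV for a spatial resolution dx given in fm."""
    return HBAR_C_MEV_FM / resolution_fm / 1000.0


if __name__ == "__main__":
    # Resolution of 10^-2 fm, as quoted in the text
    print(f"{momentum_scale_gev(1e-2):.1f} GeV")
```

For a resolution of $10^{-2}$ fm this gives a scale just below 20 GeV, in line with the order-of-magnitude statement above.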

Other evidence that hadrons are made of quarks was discussed in 
Chapter 5, where the static quark model of hadrons was used to ex- 
plain the observed multiplets of hadrons and their masses and magnetic 
moments. In Section 9.1, we will consider scattering off a nucleus and 
in Section 9.2 scattering off individual nucleons. Then we will discuss 
how the DIS data can be explained in terms of the quark–parton model 
(QPM). The DIS data give indirect evidence for the existence of gluons, 
and we will consider more direct evidence for gluons from $e^+e^-$ 
experiments. A brief discussion of QCD will be given, and this will be 
used to explain the success of the naive QPM. Finally, the QPM will be 
extended to hadron-hadron collisions. 


9.1 Rutherford scattering 


The prototype for all scattering experiments is the Rutherford scattering experiment that discovered the atomic nucleus. The experiment involved scattering $\alpha$ particles off a gold foil and measuring the angular distribution of the scattered particles. We can calculate the transition rate from Fermi’s Golden Rule, which gives 

$$w_{fi} = 2\pi\,|\langle f|H'|i\rangle|^2\,\rho(E) \tag{9.1}$$

where $H'$ is the Hamiltonian for the perturbation that causes the transition between initial state $i$ and final state $f$, and $\rho(E)$ is the density-of-states factor. In this case, the perturbation is given by the Coulomb interaction between the $\alpha$ particle (atomic number $Z_1$) and the nucleus (atomic number $Z_2$). If all the positive charge is contained in a point-like nucleus, we can write the potential as 


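$$V(r) = \frac{Z_1 Z_2 \alpha}{r} \tag{9.2}$$

where $\alpha = e^2/(4\pi\epsilon_0)$ is the fine-structure constant. The matrix element is then given by the integral of $V(r)$ between the initial and final states. We can use plane waves for the initial and final states of the $\alpha$ particles: $\psi_i = \exp(i\mathbf{k}_i\cdot\mathbf{r})$ and $\psi_f = \exp(i\mathbf{k}_f\cdot\mathbf{r})$. This gives 

$$\langle f|H'|i\rangle = \int \exp(i\mathbf{k}_i\cdot\mathbf{r})\,V(r)\,\exp(-i\mathbf{k}_f\cdot\mathbf{r})\,d^3r \tag{9.3}$$

Substituting for $V(r)$ from eqn 9.2 into eqn 9.3 and then using $\mathbf{q} = \mathbf{k}_i - \mathbf{k}_f$ for the momentum transfer gives 

$$\langle f|H'|i\rangle = Z_1 Z_2 \alpha \int \frac{e^{i\mathbf{q}\cdot\mathbf{r}}}{r}\,d^3r \tag{9.4}$$

It is convenient to use spherical polar coordinates and to take the $z$ axis of $\mathbf{r}$ to lie along $\mathbf{q}$ so that $\mathbf{q}\cdot\mathbf{r} = qr\cos\theta$. Then we can perform the integral over $\phi$ and $\theta$: 

$$\langle f|H'|i\rangle = Z_1 Z_2 \alpha\, 2\pi \int e^{iqr\cos\theta}\, r\, dr\, d\cos\theta \tag{9.5}$$

$$\phantom{\langle f|H'|i\rangle} = \frac{2\pi Z_1 Z_2 \alpha}{iq} \int \left(e^{iqr} - e^{-iqr}\right) dr \tag{9.6}$$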


This integral is divergent, which is related to the fact that the cross section for scattering of two particles via an inverse-square force law is infinite. We will proceed by modifying the potential by a factor $e^{-r/a}$, and at the end of the calculation we will let $a \to \infty$. Then we can perform the integral in eqn 9.6 to give 

$$\langle f|H'|i\rangle = Z_1 Z_2\,\frac{2\pi\alpha}{iq}\left(\frac{1}{1/a - iq} - \frac{1}{1/a + iq}\right) \tag{9.7}$$

and then, if we let $a \to \infty$, we obtain the matrix element 

$$\langle f|H'|i\rangle = \frac{4\pi Z_1 Z_2 \alpha}{q^2} \tag{9.8}$$
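The limiting procedure can be checked numerically. The sketch below evaluates the screened result of eqn 9.7 with complex arithmetic and compares it with the point-charge limit of eqn 9.8; the function names and the choice $Z_1 Z_2 \alpha = 1$ are illustrative assumptions made only to keep the numbers simple.

```python
import math


def screened_matrix_element(q, a, z1z2_alpha=1.0):
    """Eqn 9.7: matrix element with the potential screened by exp(-r/a)."""
    prefactor = 2.0 * math.pi * z1z2_alpha / (1j * q)
    return prefactor * (1.0 / (1.0 / a - 1j * q) - 1.0 / (1.0 / a + 1j * q))


def point_matrix_element(q, z1z2_alpha=1.0):
    """Eqn 9.8: the a -> infinity limit, 4*pi*Z1*Z2*alpha / q^2."""
    return 4.0 * math.pi * z1z2_alpha / q**2


if __name__ == "__main__":
    q = 2.0  # arbitrary momentum transfer in natural units
    for a in (1.0, 10.0, 1e3, 1e6):
        m = screened_matrix_element(q, a)
        print(f"a = {a:>9.1f}  M = {m.real:.6f}  (imag part {m.imag:.1e})")
    print(f"point-like limit  M = {point_matrix_element(q):.6f}")
```

For finite $a$ the two terms combine to $4\pi Z_1 Z_2\alpha/(q^2 + 1/a^2)$, which is real and tends smoothly to eqn 9.8 as $a$ grows.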

The density of states for a unit volume $V$ (this volume is arbitrary and will cancel with the flux in the calculation of the cross section) is given by (see Chapter 2) 

$$dN = \frac{d^3p_f}{(2\pi)^3} \tag{9.9}$$

We make a change of variable using 

$$\frac{dp_f}{dE_f} = \frac{1}{v_f} \tag{9.10}$$

where $v_f$ is the final velocity. As we are assuming an infinitely heavy nucleus, we can neglect the nuclear recoil energy. Writing $d^3p_f = p_f^2\,dp_f\,d\Omega$ and substituting into eqn 9.9 gives 

$$\rho(E) = \frac{dN}{dE_f} = \frac{p_f^2\,d\Omega}{(2\pi)^3\,v_f} \tag{9.11}$$

The differential scattering cross section is given by $R/v$, where $v$ is the velocity of the incident beam relative to the scattering centre and $R$ is the reaction rate.³ 

Substituting from eqns 9.8, 9.9, and 9.11 into eqn 9.1 gives the differential cross section as (see Exercise 9.1) 

$$\frac{d\sigma}{d\Omega} = \frac{4(Z_1 Z_2)^2 (m\alpha)^2}{q^4} \tag{9.12}$$


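This is an example of a scaling cross section, since it does not depend on any fixed scale in the problem. This arises because we have assumed that the nucleus is a point charge. If instead we allow for a charge density distribution $\rho(\mathbf{r})$, then the potential becomes 

$$V(\mathbf{r}) = Z_1\alpha \int \frac{\rho(\mathbf{R})}{|\mathbf{r} - \mathbf{R}|}\,d^3R \tag{9.13}$$

Then the matrix element is modified from eqn 9.8 to become 

$$\langle f|H'|i\rangle = Z_1\alpha \int d^3R \int \frac{e^{i\mathbf{q}\cdot\mathbf{r}}\,\rho(\mathbf{R})}{|\mathbf{r} - \mathbf{R}|}\,d^3r \tag{9.14}$$

Letting $\mathbf{s} = \mathbf{r} - \mathbf{R}$, we can then write the matrix element as 

$$\langle f|H'|i\rangle = Z_1\alpha \int e^{i\mathbf{q}\cdot\mathbf{R}}\,\rho(\mathbf{R})\,d^3R \int \frac{e^{i\mathbf{q}\cdot\mathbf{s}}}{s}\,d^3s \tag{9.15}$$

Hence the matrix element is modified by a multiplicative factor called the ‘form factor’ 

$$F(q^2) = \int e^{i\mathbf{q}\cdot\mathbf{R}}\,\rho(\mathbf{R})\,d^3R \tag{9.16}$$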


The form factor can be seen to be the Fourier transform of the charge distribution into momentum space. By measuring the deviations from the pure Rutherford scattering cross section, eqn 9.12, the form factor can be determined. Hence the mean size and charge density distribution of the nucleus can be inferred. However, to obtain any meaningful data on the nuclear size, one must have data with momentum transfer large enough to satisfy the inequality $qR_{\rm nucleus} > 1$, where $R_{\rm nucleus}$ is the mean size of the nucleus. 

3 We have set the normalization volume to be unity, and therefore the incident flux is $v$, and we have assumed a single target nucleus. 

Fig. 9.1 Cross section for the scattering of 153 MeV electrons off gold nuclei, with fits to the nuclear charge distribution for a sharp edge (A) and a realistic rounded edge (B). From [53]. 

The cross section no longer shows scaling behaviour, because of the effect of the finite size of the nucleus. For example, if the nucleus had an exponential charge distribution 

$$\rho(r) = \rho_0\,e^{-mr} \tag{9.17}$$

then the form factor would be given by 

$$F(q^2) = \left(1 + \frac{q^2}{m^2}\right)^{-2} \tag{9.18}$$

Note that for $q \ll m$, $F(q^2)$ is a constant, whereas for $q \gg m$, $F(q^2) \sim q^{-4}$, i.e. the cross section is suppressed by a factor of $1/q^8$. This feature that the cross section is suppressed for values of $q$ large compared with $1/R$, where $R$ is the size of the nucleus, is a general result and will be true whatever the precise form of the charge distribution. An example of some real data [53] and a fit to the charge density distribution are shown in Fig. 9.1. 
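The statement that the form factor is the Fourier transform of the charge distribution can be illustrated numerically for the exponential case. The sketch below is a hedged example rather than a prescription: it assumes the density is normalised to unit total charge ($\rho_0 = m^3/8\pi$), performs the angular integral analytically so that $F(q^2) = 4\pi\int\rho(r)\,[\sin(qr)/(qr)]\,r^2\,dr$, and evaluates the remaining radial integral with Simpson's rule. All function names are our own.

```python
import math


def rho_exponential(r, m):
    """Exponential density of eqn 9.17, normalised to unit total charge."""
    return m**3 / (8.0 * math.pi) * math.exp(-m * r)


def form_factor_numeric(q, m, n=20000, cutoff=40.0):
    """Radial Fourier integral of rho(r), evaluated with Simpson's rule.

    n must be even; the integral is cut off at r = cutoff/m, where
    exp(-m r) is negligible.
    """
    r_max = cutoff / m
    h = r_max / n
    total = 0.0
    for i in range(n + 1):
        r = i * h
        qr = q * r
        sinc = math.sin(qr) / qr if qr != 0.0 else 1.0
        f = 4.0 * math.pi * rho_exponential(r, m) * sinc * r * r
        weight = 1 if i in (0, n) else (4 if i % 2 else 2)
        total += weight * f
    return total * h / 3.0


def form_factor_dipole(q, m):
    """Closed form for the exponential distribution: (1 + q^2/m^2)^-2."""
    return (1.0 + (q / m) ** 2) ** -2


if __name__ == "__main__":
    m = 1.0  # inverse size of the distribution, arbitrary units
    for q in (0.0, 0.5, 1.0, 3.0, 10.0):
        print(q, form_factor_numeric(q, m), form_factor_dipole(q, m))
```

The numerical and closed-form values agree, and the output shows the flat behaviour at small $q$ and the rapid fall-off once $q$ exceeds $m$.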
9.2 Scattering from nucleons 


In the previous section, we have seen how deviations from Rutherford 
scattering can be used to measure the charge distribution of the nucleus. 
Now we will consider the problem of how to look for evidence of indi- 
vidual nucleons (protons and neutrons) in a nucleus. The answer is to 
do scattering experiments where we look for evidence of inelastic scat- 
tering off the whole nucleus and elastic scattering off a single nucleon. 
The kinematics are defined in Fig. 9.2. 

The invariant mass $W$ of the recoiling nucleon of mass $M$ is given by 

$$W^2 = (E^*)^2 - |\mathbf{p}^*|^2 \tag{9.19}$$

and the 4-momentum transfer $q$ is found from 4-momentum conservation at the lower vertex as $q = (E^* - M, \mathbf{p}^*)$. Therefore, we can write down the square of the 4-momentum transfer: 

$$q^2 = (E^* - M)^2 - |\mathbf{p}^*|^2 = (E^*)^2 - |\mathbf{p}^*|^2 - 2ME^* + M^2 \tag{9.20}$$

Substituting for $W$ from eqn 9.19 gives 

$$q^2 = W^2 - 2ME^* + M^2 \tag{9.21}$$

From energy conservation, $\nu = E - E' = E^* - M$ and therefore, from eqn 9.21, 

$$q^2 = W^2 - M^2 - 2M\nu \tag{9.22}$$

For elastic scattering off the entire nucleus, the mass of the nucleus is unchanged in the collision, and hence $W = M$, so we get a peak in the change in the electron energy given by 

$$\nu = -\frac{q^2}{2M_{\rm nucleus}} \tag{9.23}$$

Similarly, if we had scattering off a single nucleon, then we would expect 

$$\nu = -\frac{q^2}{2M_{\rm nucleon}} \tag{9.24}$$

Some example data [87] are shown in Fig. 9.3, where the cross section 
for electron scattering off helium is plotted as a function of the scattered 
electron energy E’. A sharp peak is seen at the value expected for elastic 
scattering off the entire nucleus and a smeared peak is seen for elastic 
scattering off individual nucleons. The smearing is caused by the Fermi 
motion of the nucleons within the nucleus. Naively, we might hope that 
we could see evidence for the quark structure of the nucleons in a similar 
way, but now the Fermi motion is so big that it completely smears out 
the peak corresponding to elastic scattering off a quark. 
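To see roughly where the two elastic peaks should sit, one can combine $\nu = -q^2/2M$ with the standard massless-electron expression $-q^2 = 2EE'(1 - \cos\theta)$, which gives $E' = E/[1 + (E/M)(1 - \cos\theta)]$. The sketch below applies this to the conditions of Fig. 9.3. The expression for $q^2$, the assumption that the target is $^4$He, the rounded mass values, and the function name are all assumptions introduced here for illustration; Fermi motion is ignored, so the nucleon value only marks the centre of the smeared peak.

```python
import math

# Approximate masses in MeV (rounded values, assumed for illustration)
M_HE4_NUCLEUS = 3727.4
M_NUCLEON = 938.9


def elastic_scattered_energy(e_beam, theta_deg, target_mass):
    """E' for elastic scattering of a massless electron off a target at rest."""
    one_minus_cos = 1.0 - math.cos(math.radians(theta_deg))
    return e_beam / (1.0 + (e_beam / target_mass) * one_minus_cos)


if __name__ == "__main__":
    e_beam, theta = 400.0, 45.0  # MeV and degrees, as in Fig. 9.3
    print("whole nucleus :", elastic_scattered_energy(e_beam, theta, M_HE4_NUCLEUS))
    print("single nucleon:", elastic_scattered_energy(e_beam, theta, M_NUCLEON))
```

Under these assumptions the nuclear peak lies about 12 MeV below the beam energy and the nucleon peak about 44 MeV below it, so the two are well separated in $E'$.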
Fig. 9.2 Kinematics for scattering off individual nucleons. We assume that the nucleus is initially at rest in the laboratory frame of reference. 

Fig. 9.3 Scattering of 400 MeV electrons off He, with the scattering angle fixed at $\theta = 45°$. $E'$ is the energy of the scattered electron [87]. 

Fig. 9.4 Kinematics for quark–parton scattering. 

9.3 Quark–parton model 


This section introduces the quark—parton model (QPM), which will be 
used in the following sections to make predictions for deep inelastic 
scattering processes. Our approach is not historically accurate—it took 
physicists a long time to take the experimental data seriously enough to 
really believe in quarks. However, it is much easier to understand. We 
will compare the predictions with the experimental data and show that 
the key features are confirmed by the data, namely that the nucleons 
contain spin-½, point-like particles with the fractional electric charges 
expected in the quark model. 

In the QPM, we assume that the inelastic scattering of a lepton at 
large momentum transfer q with a nucleon is due to the elastic scattering 
off a quark via exchange of a virtual boson ($\gamma$, W, or Z). First, we will 
review the kinematics of the reaction in Section 9.3.1; then in Sections 9.4 
and 9.5, we develop the dynamics to allow us to predict the form of the 
differential cross sections. We then compare QPM predictions with the 
experimental data and discuss many internal consistency checks that 
give us confidence in the model. 


9.3.1 Kinematics of deep inelastic scattering 


The key idea of the QPM is shown in Fig. 9.4, which also defines the kinematics. We assume that the quarks each have a mass that is a fraction $x$ of the nucleon mass; i.e. the quark mass $m = xM$. The energy of the struck quark after the scattering is $E^* = \nu + xM$, where $\nu$ is the energy transferred by the virtual photon. In terms of the incoming and outgoing electron energies, $\nu = E - E'$, and the 3-momentum transfer is $\mathbf{p}^* = \mathbf{q}$. Then, using $E^2 - p^2 = m^2$ for the struck quark after the elastic collision, we get 

$$(\nu + xM)^2 - |\mathbf{q}|^2 = (xM)^2 \tag{9.25}$$

Multiplying this out gives 

$$\nu^2 + 2xM\nu + (xM)^2 - |\mathbf{q}|^2 = (xM)^2 \tag{9.26}$$

Defining $Q^2 = -q^2 = -(\nu^2 - |\mathbf{q}|^2)$, we solve for the parton mass fraction $x$ as 

$$x = \frac{Q^2}{2M\nu} \tag{9.27}$$

Note that $x$ can be determined purely from a measurement of the scattered lepton momentum. This is important from an experimental point of view, since it is generally easier to make precise measurements of the momenta of electrons and muons than those of hadrons.⁴ 

The variable $x$ can also be considered as representing the fraction of the nucleon momentum carried by a parton. In DIS, the 4-momentum transfer $q$ is large enough for the mass of the quark to be neglected in comparison with its energy. Hence, considering the 4-momentum of the quark after the scattering and using the fact that the quark mass is negligible before the interaction, 

$$(p_{\rm quark} + q)^2 = 0$$

that is, 

$$p_{\rm quark}^2 + 2p_{\rm quark}\cdot q + q^2 = 0 \tag{9.28}$$

Letting $p_{\rm quark} = xp$, where $p$ is the nucleon 4-momentum, we can solve for $x$ from eqn 9.28 as 

$$x = -\frac{q^2}{2p\cdot q} = \frac{Q^2}{2p\cdot q} \tag{9.29}$$

Now, $p\cdot q$ is a Lorentz invariant, so we can evaluate it in any frame we wish. Consider the nucleon rest frame, in which $p = (M, \mathbf{0})$ and hence $p\cdot q = M\nu$, and use eqn 9.29: 

$$x = \frac{Q^2}{2M\nu} \tag{9.30}$$

This value of $x$ is the same as that obtained from eqn 9.27, so we can consider the variable $x$ either as representing the mass fraction of the nucleon carried by a quark or as the momentum fraction of the nucleon carried by the quark.⁵ 

Before we can start the discussion of the dynamics of the scattering process, we need to evaluate one more piece of kinematics, namely the relation between the laboratory energies and the CMS scattering angle $\theta^*$. 

4 Early experiments used ‘single-arm spectrometers’, which could only measure the scattered lepton momentum and not that of the hadronic jet. 

5 Strictly speaking, this is only correct in the ‘infinite-momentum frame’, in which the nucleus has an infinite momentum along the collision axis so that we can neglect transverse components of the momentum. 

6 Note that the variable $y$ is only defined in the lab frame. 

7 We will consider electron neutrinos in the following discussion, but in the DIS regime the masses of the leptons are negligible, so the same results hold for muon neutrinos. 

8 In principle, there are five possible couplings, each with a different experimental signature. See Further Reading to follow this up. 

The Lorentz transformation from the CMS to the laboratory system for the scattered and initial leptons (we assume $\beta = 1$) is 

$$E'_{\rm lab} = \gamma E^*(1 + \cos\theta^*), \qquad E_{\rm lab} = \gamma E^*(1 + 1) \tag{9.31}$$

We define the scaled fractional energy transfer for the scattered lepton to be $y = \nu/E$, so $0 < y < 1$. Then we can relate the CMS scattering angle $\theta^*$ and $y$ as⁶ 

$$y = \tfrac{1}{2}(1 - \cos\theta^*) \tag{9.32}$$
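The kinematic variables introduced in this section can all be computed from the beam energy and the measured energy and angle of the scattered lepton. The following sketch collects them in one function. It assumes a fixed target at rest and a massless lepton, for which $Q^2 = 4EE'\sin^2(\theta/2)$; that expression, the function and field names, and the example beam settings are illustrative assumptions rather than values from the text.

```python
import math
from typing import NamedTuple

M_NUCLEON_GEV = 0.938  # approximate nucleon mass in GeV


class DISKinematics(NamedTuple):
    nu: float          # energy transfer E - E' (GeV)
    q2: float          # Q^2 = -q^2 (GeV^2)
    x: float           # eqn 9.27: Q^2 / (2 M nu)
    y: float           # nu / E
    cos_theta_cm: float  # from eqn 9.32: cos(theta*) = 1 - 2y


def dis_kinematics(e_beam, e_scattered, theta_lab_deg, m_nucleon=M_NUCLEON_GEV):
    """Compute DIS variables from lepton measurements only (massless lepton)."""
    nu = e_beam - e_scattered
    half_angle = math.radians(theta_lab_deg) / 2.0
    q2 = 4.0 * e_beam * e_scattered * math.sin(half_angle) ** 2
    x = q2 / (2.0 * m_nucleon * nu)
    y = nu / e_beam
    return DISKinematics(nu, q2, x, y, 1.0 - 2.0 * y)


if __name__ == "__main__":
    # Hypothetical measurement: 20 GeV beam, 12 GeV scattered lepton at 10 degrees
    print(dis_kinematics(20.0, 12.0, 10.0))
```

Only lepton quantities enter, which is the experimental advantage noted above: no measurement of the hadronic final state is needed to determine $x$ and $y$.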


Having completed our discussion of the kinematics of lepton—nucleon 
scattering, we can now look at the dynamics. We will start with neutrino 
probes in Section 9.4.1, because the spin structure is simpler for this 
case than for the case of charged lepton probes, which we will consider 
in Section 9.5. 


9.4 Neutrino interactions 


We are now ready to use the QPM to explain the DIS data for neutrino probes.⁷ We will proceed in the following steps: 

(1) Calculate the elastic scattering cross sections for 
    (a) $\bar\nu_e e \to \bar\nu_e e$ 
    (b) $\nu_e e \to \nu_e e$ 

(2) Generalize to the elastic scattering of (anti)neutrinos on (anti)quarks. 

(3) Generalize to the case of (anti)neutrinos scattering off nucleons consisting of quarks and antiquarks. 

As discussed in Chapter 7, weak interactions are parity-violating; the angular distribution in $\beta$ decay is maximally parity-violating and the leptons are emitted polarized. The parity violation is described by a V−A interaction, where V and A stand for vector and axial vector couplings, respectively.⁸ We consider for now only virtual W exchange (i.e. we neglect Z exchange) and so we have a pure V−A interaction. In the high-energy limit, this leads to a very simple spin structure for the process whereby the W bosons couple only to negative-helicity leptons and positive-helicity antileptons, where the helicity is defined by the normalized projection of the spin $\mathbf{s}$ onto the momentum $\mathbf{p}$ of the particle: 

$$H = \frac{\mathbf{p}\cdot\mathbf{s}}{|\mathbf{p}|\,|\mathbf{s}|} \tag{9.33}$$

so that for spin-½ fermions the eigenvalues of helicity are +1 and −1. Therefore, the spin structures for the interactions $\bar\nu_e e \to \bar\nu_e e$ and $\nu_e e \to \nu_e e$ are as shown in Fig. 9.5. For the interaction in Fig. 9.5(a), the overall initial state must have an angular momentum about the $z$ axis (the initial direction of the $\bar\nu_e$) given by the quantum number $J_z = 1$. The interaction proceeds via the creation of a virtual W. The spin of a real W is measured to be 1 and we will assume that the spin of the virtual W is also 1. Hence the initial state must have $J = 1$, $J_z = 1$ and by conservation of angular momentum the final state will have the same angular momentum quantum numbers. However, we know that the scattered leptons will be in eigenstates of helicity (positive for the $\bar\nu_e$ and negative for the $e^-$); i.e. the $\bar\nu_e$ will have its spin along its direction of motion. Hence, considering a $z'$ axis that is rotated from the $z$ axis to lie along the final $\bar\nu_e$ direction of motion, $J_{z'} = \tfrac{1}{2}$. Similarly, the electron has $J_{z'} = \tfrac{1}{2}$. Therefore, the amplitude for the reaction in Fig. 9.5(a) can be found by projecting out the spin states onto the rotated axes. This requires the rotation matrix for spin-½ particles (see Chapter 2 and Exercise 2.4): 

$$d^{1/2}(\theta) = \begin{pmatrix} \cos\tfrac{1}{2}\theta & -\sin\tfrac{1}{2}\theta \\ \sin\tfrac{1}{2}\theta & \cos\tfrac{1}{2}\theta \end{pmatrix} \tag{9.34}$$

The elements of this matrix give the amplitude for a state with a given quantum number $m$ to be found with a value of $m'$ after a rotation by a polar angle $\theta$. 


9.4.1 Cross section for neutrino—electron elastic 
scattering 


We are now in a position to calculate the cross section for elastic $\nu_e e$ and $\bar\nu_e e$ scattering. From the spin structure for the reaction $\bar\nu_e e \to \bar\nu_e e$ shown in Fig. 9.5(a), the amplitude as a function of polar angle can be calculated by projecting the spin states onto the rotated axes. This projection changes the electron ($\bar\nu_e$) state from $m = \tfrac{1}{2}$ to $m' = \tfrac{1}{2}$, and hence the amplitude as a function of polar angle is given by⁹ 

$$A(\theta) = \left(d^{1/2}_{1/2,\,1/2}(\theta)\right)^2 \tag{9.35}$$

and, substituting for the appropriate element of the rotation matrix from eqn 9.34, we get 

$$A(\theta) = \left(\cos\tfrac{1}{2}\theta\right)^2 = \tfrac{1}{2}(1 + \cos\theta) \tag{9.36}$$

The cross section is proportional to $|A(\theta)|^2$. We can use eqn 9.32 to change variables from $\theta$ to $y$. From the two-body phase space, the cross section should also be proportional to $(p^*)^2$, where $p^*$ is the CMS momentum. We define $s$ to be the CMS energy squared, and then¹⁰ $(p^*)^2 = s/4$. The amplitude of the charged-current weak interaction is proportional to the Fermi coupling constant $G_F$. Hence, from eqn 9.36 and putting in the correct numerical factor,¹¹ we get the differential cross section for elastic $\bar\nu_e e$ scattering: 

$$\frac{d\sigma(\bar\nu_e e \to \bar\nu_e e)}{dy} = \frac{G_F^2 s}{\pi}(1 - y)^2 \tag{9.37}$$

Fig. 9.5 Spin structure for (a) $\bar\nu_e e$ scattering and (b) $\nu_e e$ scattering. The left (right) diagram is before (after) the scatter. 

9 We have chosen to apply the spin-½ rotation matrices to the $\bar\nu_e$ and $e$ states separately. The same result could have been obtained by considering the combined system and using spin-1 rotation matrices. 

10 We can neglect the electron mass since $p^* \gg m_e$. 

11 The calculation of the correct numerical factors requires an application of the Feynman rules. This is described in the standard graduate-level textbooks; see Griffiths in Further Reading for an example. 

Fig. 9.6 Definition of the angles used in the crossing symmetry argument. 


Similarly, for the elastic scattering reaction $\nu_e e$, from the spin structure shown in Fig. 9.5(b), the angular momentum of the initial state must have $J_z = 0$. The possible values of the total angular momentum quantum number $J$ are therefore 0 or 1. 

We can calculate the amplitude for the reaction $\nu_e e \to \nu_e e$, given the amplitude for the reaction $\bar\nu_e e \to \bar\nu_e e$. From the solutions to the Dirac equation, we saw that we can formally represent the states that apparently have negative energy as positive-energy states travelling backwards in time. Hence we can use crossing symmetry to relate the amplitudes of reactions with a particle exchanged for its antiparticle. This states that for two diagrams related by crossing, the structures of the matrix elements are the same and we have only to replace the momentum of the incoming (outgoing) particles with minus that of the outgoing (incoming) antiparticles. From Fig. 9.6, we can rewrite the amplitude for $\bar\nu_e e$ scattering (eqn 9.36) in terms of the 3-momenta and the magnitude of the CMS momentum of the particles, $p^*$, as 

$$A_{\bar\nu}(\theta^*) = \frac{(p^*)^2 + \mathbf{p}_1\cdot\mathbf{p}_3}{2(p^*)^2} \tag{9.38}$$

and, in the CMS, $\mathbf{p}_3 = -\mathbf{p}_4$, so we can rewrite this as 

$$A_{\bar\nu}(\theta^*) = \frac{(p^*)^2 - \mathbf{p}_1\cdot\mathbf{p}_4}{2(p^*)^2} \tag{9.39}$$

We can rewrite eqn 9.39 in terms of 4-vectors as 

$$A_{\bar\nu}(\theta^*) = \frac{p_1\cdot p_4}{2(p^*)^2} \tag{9.40}$$

and then use crossing symmetry to find the amplitude for $\nu_e e$ scattering by the substitution $p_1 \leftrightarrow -p_3$: 

$$A_{\nu}(\theta^*) = -\frac{p_3\cdot p_4}{2(p^*)^2} \tag{9.41}$$

We use $(p_3 + p_4)^2 = p_3^2 + p_4^2 + 2p_3\cdot p_4$ and we can neglect the masses. Since $(p_3 + p_4)^2 = s = 4(p^*)^2$, it follows that $p_3\cdot p_4 = 2(p^*)^2$, and we can see from eqn 9.41 that the amplitude for $\nu_e e$ scattering is isotropic. Hence, putting in the factors of $G_F$, $s$, and overall normalization as for $\bar\nu_e e \to \bar\nu_e e$, we get the differential cross section for $\nu_e e$ elastic scattering as 

$$\frac{d\sigma(\nu_e e \to \nu_e e)}{dy} = \frac{G_F^2 s}{\pi} \tag{9.42}$$

In summary, the $\bar\nu_e e^-$ scattering cross section has a factor of $(1 - y)^2$ whereas the $\nu_e e^-$ cross section has a factor of 1. A similar argument can be used to obtain the cross section for $\bar\nu_e e^+$ from that for $\bar\nu_e e^-$, and we find that $\bar\nu_e e^+$ scattering has a factor of 1 whereas $\nu_e e^+$ scattering has a factor of $(1 - y)^2$. 
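The link between the rotation-matrix amplitude and the $y$ dependence can be made concrete with a short numerical check. The sketch below builds the spin-½ rotation matrix of eqn 9.34, confirms that its squared diagonal element equals $1 - y$ when $y$ is related to the angle by eqn 9.32, and then integrates the two $y$ shapes over $0 < y < 1$ to obtain their relative weight. The function names and the simple midpoint integration are our own choices.

```python
import math


def d_half(theta):
    """Spin-1/2 rotation matrix of eqn 9.34 as a nested list."""
    c, s = math.cos(theta / 2.0), math.sin(theta / 2.0)
    return [[c, -s], [s, c]]


def amplitude_antinu_e(theta):
    """Eqn 9.35: square of the m = 1/2 -> m' = 1/2 element."""
    return d_half(theta)[0][0] ** 2


def y_of_theta(theta):
    """Eqn 9.32: y = (1 - cos(theta*)) / 2."""
    return 0.5 * (1.0 - math.cos(theta))


def integrate_unit_interval(f, n=100000):
    """Midpoint-rule integral of f over 0 < y < 1."""
    h = 1.0 / n
    return sum(f((i + 0.5) * h) for i in range(n)) * h


if __name__ == "__main__":
    for theta in (0.0, 0.5, 1.5, 3.0):
        print(theta, amplitude_antinu_e(theta), 1.0 - y_of_theta(theta))
    ratio = (integrate_unit_interval(lambda y: (1.0 - y) ** 2)
             / integrate_unit_interval(lambda y: 1.0))
    print("integrated (1-y)^2 relative to flat:", ratio)
```

The amplitude equals $1 - y$ at every angle, so the cross section carries the factor $(1 - y)^2$, and integrating over $y$ suppresses it by a factor of one third relative to the isotropic case.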


9.4.2 Neutrino—quark scattering 


The next step is to generalize the results for neutrino-electron scatter- 
ing to the case of neutrino—quark scattering. The universality of the 
charged-current weak interactions means that the weak interaction has 
the same strength for quarks as for electrons (see Chapter 7). Unfortu- 
nately, we do not have free quarks available as targets, so we have to use 
the quarks that are confined in nucleons. One of the main techniques 
when doing dynamic quark modelling is to consider the kinematics of 
elastic scattering of a neutrino with a quark that carries a fraction x of 
the nucleon momentum (see Fig. 9.7). 

The Lorentz-invariant CMS energy squared of the neutrino–quark system is given by 

$$\hat{s} = 4x(p^*)^2 = xs \tag{9.43}$$

Hence we can write down the cross sections for $\nu_e$ ($\bar\nu_e$) elastic scattering on $q$ ($\bar q$) by analogy with the $\nu_e e$ ($\bar\nu_e e$) elastic scattering cross sections¹² (eqns 9.37 and 9.42) by the substitution¹³ $s \to \hat{s} = xs$: 

$$\begin{aligned}
\frac{d\sigma(\nu_e q \to \nu_e q)}{dy} &= \frac{G_F^2 s x}{\pi}, &
\frac{d\sigma(\bar\nu_e q \to \bar\nu_e q)}{dy} &= \frac{G_F^2 s x}{\pi}(1 - y)^2, \\
\frac{d\sigma(\nu_e \bar q \to \nu_e \bar q)}{dy} &= \frac{G_F^2 s x}{\pi}(1 - y)^2, &
\frac{d\sigma(\bar\nu_e \bar q \to \bar\nu_e \bar q)}{dy} &= \frac{G_F^2 s x}{\pi}
\end{aligned} \tag{9.44}$$

Fig. 9.7 Neutrino–quark scattering kinematics. 

12 We ignore threshold effects, which can be important for the case of charm production near threshold. 

13 The variable $y$ is unchanged here because it depends only on the neutrino quantities, not on those of the quark. 

Fig. 9.8 Neutrino and antineutrino cross sections on nuclei as functions of $y$. From [71]. 


9.4.3 Neutrino—nucleon cross sections 


We can now put together all the pieces and calculate the cross section for neutrino–nucleon scattering in the DIS regime. The basic assumption of the QPM is that the cross section for scattering leptons off nucleons is given by the incoherent sum of scattering off free quarks. This assumption is only valid for large values of $Q^2$ ($Q^2 > 1\ \mathrm{GeV}^2$). We will discuss the justification for this assumption when we consider the theory of strong interactions in Section 9.6.3. Let $q(x)$ be the probability distribution function of quarks with a momentum fraction $x$; i.e. the probability of finding a quark with momentum fraction in the range $(x, x + dx)$ is $q(x)\,dx$. Similarly, let $\bar q(x)$ be the equivalent distribution for antiquarks. The QPM makes no prediction for $q(x)$ or $\bar q(x)$, so we have to obtain them from fits to experimental data. Nevertheless, the QPM does make many clear predictions that can be tested experimentally. With the above assumptions, we are finally able to write down the cross sections for neutrino and antineutrino scattering off nucleons as 

$$\frac{d^2\sigma(\nu_e N)}{dx\,dy} = \frac{G_F^2 s x}{\pi}\left[q(x) + (1 - y)^2\,\bar q(x)\right] \tag{9.45}$$

$$\frac{d^2\sigma(\bar\nu_e N)}{dx\,dy} = \frac{G_F^2 s x}{\pi}\left[(1 - y)^2\,q(x) + \bar q(x)\right] \tag{9.46}$$

We can compare these predictions with the measured $y$ distributions for neutrino and antineutrino beams [115] shown in Fig. 9.8. 

If the nucleons contained only quarks and not antiquarks, then we would expect the neutrino data to be constant in $y$ and the antineutrino data to show a $(1 - y)^2$ dependence. The data can be fit by a mixture of constant and $(1 - y)^2$ components, which tells us that nucleons are composed of quarks and antiquarks. The proportion of quarks to antiquarks can thus be estimated from these data. The fact that the data do fit the QPM prediction for the $y$ distribution is a very important test of the theory. It tells us that the neutrinos are interacting with free spin-½ particles (the quarks and antiquarks) according to the parity-violating V−A interaction. We can estimate the total cross sections by integrating the differential cross sections in eqns 9.45 and 9.46. The $y$ integral is trivial to perform and the $x$ integral gives a constant of the order of unity (depending on the unknown $q(x)$ distribution). If we assume that the $\bar q$ content of the nucleon is negligible, we get 


(9.47) 


This prediction is compared with the experimental data [115] in 
Fig. 9.9.¹⁴ The neutrino cross sections are found to be larger than the 
antineutrino cross sections, as expected if the nucleons consist mainly 
of quarks as opposed to antiquarks—but not by as much as a factor 
of 3, which indicates that there are some antiquarks in protons. More 
significantly, the fact that the cross section scales like s is telling us that 
the quarks are behaving as point-like particles. If the quarks had a finite 
size, then the cross section would be suppressed by a form factor. 
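The statement that the measured ratio falls short of a factor of 3 can be turned into a rough estimate of the antiquark content. Integrating eqns 9.45 and 9.46 over $y$ gives $\sigma(\nu N) \propto Q + \bar Q/3$ and $\sigma(\bar\nu N) \propto Q/3 + \bar Q$, where $Q = \int x\,q(x)\,dx$ and $\bar Q = \int x\,\bar q(x)\,dx$. The sketch below evaluates the resulting ratio and its inverse; the function names and the example ratio of 0.5 are illustrative assumptions, not measured values quoted in the text.

```python
def antinu_to_nu_ratio(qbar_over_q):
    """sigma(nubar N) / sigma(nu N) from the y-integrated eqns 9.45 and 9.46.

    qbar_over_q is the ratio of momentum integrals Qbar/Q.
    """
    eps = qbar_over_q
    return (1.0 / 3.0 + eps) / (1.0 + eps / 3.0)


def qbar_over_q_from_ratio(ratio):
    """Invert the relation above: eps = (3R - 1) / (3 - R)."""
    return (3.0 * ratio - 1.0) / (3.0 - ratio)


if __name__ == "__main__":
    print("no antiquarks   :", antinu_to_nu_ratio(0.0))
    print("equal q and qbar:", antinu_to_nu_ratio(1.0))
    # Hypothetical measured ratio, chosen only to illustrate the inversion
    print("Qbar/Q for R=0.5:", qbar_over_q_from_ratio(0.5))
```

With no antiquarks the ratio is exactly one third; any measured value between one third and unity maps onto a non-zero antiquark momentum fraction.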


9.4.4 Parton distribution functions 


In the previous section, we have seen that the QPM can explain some 
of the key features of neutrino—nucleon interactions. However, the quark 
distribution functions are not predicted by the model and have to be 
determined from experimental data. In this section, we will explain how 
this is done and discuss some useful consistency checks of the theory. 
14 For a fixed target nucleon, $s \approx 2m_N E_\nu$, where $E_\nu$ is the neutrino beam energy. 

Fig. 9.9 Neutrino and antineutrino cross sections. From [115]. 

15 This is neglecting the ‘longitudinal’ structure function $F_L$, which is expected to be a good approximation at large values of momentum transfer $Q^2$. Note also that some older analyses of DIS data use the structure function $F_1$ rather than $F_L$, where $F_L = F_2 - 2xF_1$, but this hides the evidence that $F_L$ is small. 

16 We will return to this later in the chapter. 

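First, it is convenient to rewrite the ν–nucleon cross sections given by
eqn 9.46 in a form more suitable for comparison with the experimental
data. We use the relationship

\[
q(x) + (1-y)^2\,\bar q(x) = [q(x) + \bar q(x)]\,\frac{1+(1-y)^2}{2} + [q(x) - \bar q(x)]\,\frac{1-(1-y)^2}{2}
\tag{9.48}
\]

and substitute into eqn 9.46 to obtain

\[
\frac{d^2\sigma(\nu N, \bar\nu N)}{dx\,dy} = \frac{G_F^2\, x s}{2\pi}\left\{[q(x) + \bar q(x)]\,[1+(1-y)^2] \pm [q(x) - \bar q(x)]\,[1-(1-y)^2]\right\}
\tag{9.49}
\]

We can compare the cross-section formula of eqn 9.49 with a general
phenomenological formula that is allowed by Lorentz invariance:¹⁵

\[
\frac{d^2\sigma(\nu N, \bar\nu N)}{dx\,dy} = \frac{G_F^2\, s}{2\pi}\left[F_2(x,q^2)\,\frac{1+(1-y)^2}{2} \pm x F_3(x,q^2)\,\frac{1-(1-y)^2}{2}\right]
\tag{9.50}
\]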
The so-called structure functions $F_2$ and $F_3$ can be determined by fits
to the experimental data. Note that in general the functions $F_i$ can
depend on $q^2$ as well as $x$. The QPM does not predict any dependence
on $q^2$ and to a first approximation this agrees with data.¹⁶ Then, from
a comparison of eqns 9.49 and 9.50, we can obtain the quark probability
distribution functions as

\[
\sum q(x) = \frac{1}{4}\left[\frac{F_2(x)}{x} + F_3(x)\right], \qquad
\sum \bar q(x) = \frac{1}{4}\left[\frac{F_2(x)}{x} - F_3(x)\right]
\tag{9.51}
\]

where the sum runs over the different quark flavours. Or, equivalently,
we can write the inverse relation to give the structure functions in terms
of the quark distribution functions:

\[
F_2(x) = 2x \sum [q(x) + \bar q(x)], \qquad
F_3(x) = 2 \sum [q(x) - \bar q(x)]
\tag{9.52}
\]

So far in the discussion, we have not defined precisely which quark
flavours we are considering for the quark distribution functions. We assume
that the nucleons contain only the light u, d, and s quarks and
the corresponding antiquarks (i.e. we neglect the heavy-quark c, b, and
t content).
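The relation between the structure functions and the flavour-summed quark densities can be checked with a short round trip. The sketch below assumes the forms $F_2 = 2x\sum(q+\bar q)$ and $F_3 = 2\sum(q-\bar q)$ (consistent with eqn 9.53); the function names and the numerical inputs are illustrative only.

```python
def structure_functions(x, q_sum, qbar_sum):
    """Eqn 9.52 (assumed form): F2 and F3 from the flavour-summed densities."""
    f2 = 2.0 * x * (q_sum + qbar_sum)
    f3 = 2.0 * (q_sum - qbar_sum)
    return f2, f3


def quark_densities(x, f2, f3):
    """Eqn 9.51 (assumed form): invert F2, F3 to recover sum q(x) and sum qbar(x)."""
    q_sum = 0.25 * (f2 / x + f3)
    qbar_sum = 0.25 * (f2 / x - f3)
    return q_sum, qbar_sum


# Illustrative (not measured) values at a single x point.
x, q_sum, qbar_sum = 0.2, 1.5, 0.3
f2, f3 = structure_functions(x, q_sum, qbar_sum)
q_back, qbar_back = quark_densities(x, f2, f3)
print(f"F2 = {f2:.3f}, F3 = {f3:.3f}")
print(f"recovered q = {q_back:.3f}, qbar = {qbar_back:.3f}")
```

In a real analysis $F_2$ and $F_3$ would come from fits to cross-section data in bins of $x$; here the round trip only confirms that the two sets of relations are mutually inverse.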


The Feynman diagrams corresponding to the interactions of neutrinos
and antineutrinos on these light quarks are shown in Fig. 9.10,
from which we see that neutrinos interact with d and s quarks and
$\bar u$ antiquarks, whereas antineutrinos interact with u quarks and $\bar d$ and
$\bar s$ antiquarks. It is conventional to define the quark distribution
functions as referring to protons. (It should be noted that, at this stage, we
are ignoring the differences between the weak eigenstates and the mass
eigenstates for the quarks. This is explained in Chapter 7.)

We assume that the $s$ and $\bar s$ quark distribution functions are the same
in neutrons and protons. Therefore, using eqn 9.52, we can write the
structure functions for proton targets in terms of the quark distribution
functions as

\[
\begin{aligned}
F_2^{\nu p}(x) &= 2x[d(x) + \bar u(x) + s(x)] \\
F_2^{\bar\nu p}(x) &= 2x[u(x) + \bar d(x) + \bar s(x)] \\
F_3^{\nu p}(x) &= 2[d(x) + s(x) - \bar u(x)] \\
F_3^{\bar\nu p}(x) &= 2[u(x) - \bar d(x) - \bar s(x)]
\end{aligned}
\tag{9.53}
\]


For a neutron target, we will assume isospin (SU(2) flavour) symmetry;
i.e. we assume that the d quarks ($\bar d$ antiquarks) in a neutron have the
same distribution function as the u quarks ($\bar u$ antiquarks) in a proton.
Then we can simply use the substitution $u \leftrightarrow d$ and $\bar u \leftrightarrow \bar d$ in eqn 9.53
to write down the neutron structure functions in terms of the quark
distribution functions as

\[
\begin{aligned}
F_2^{\nu n}(x) &= 2x[u(x) + \bar d(x) + s(x)] \\
F_2^{\bar\nu n}(x) &= 2x[d(x) + \bar u(x) + \bar s(x)] \\
F_3^{\nu n}(x) &= 2[u(x) + s(x) - \bar d(x)] \\
F_3^{\bar\nu n}(x) &= 2[d(x) - \bar s(x) - \bar u(x)]
\end{aligned}
\tag{9.54}
\]


For an ‘isoscalar’ target, i.e. one with equal numbers of neutrons and
protons in an isospin $I = 0$ state,¹⁷ we can write down the structure
functions from the average of the proton and neutron structure functions
(eqns 9.53 and 9.54). We also assume $s(x) = \bar s(x)$, because s quarks
and $\bar s$ antiquarks have to be created together by the strong interaction,
which conserves strangeness. We have

\[
\begin{aligned}
F_2^{\nu N}(x) &= x[u(x) + d(x) + \bar u(x) + \bar d(x) + 2s(x)] \\
F_2^{\bar\nu N}(x) &= x[u(x) + d(x) + \bar u(x) + \bar d(x) + 2s(x)] \\
F_3^{\nu N}(x) &= u(x) - \bar u(x) + d(x) - \bar d(x) + 2s(x) \\
F_3^{\bar\nu N}(x) &= u(x) - \bar u(x) + d(x) - \bar d(x) - 2s(x)
\end{aligned}
\tag{9.55}
\]


It is convenient to divide the quark distribution functions into ‘valence’
and ‘sea’ parts:

\[
u(x) = u_v(x) + u_s(x)
\tag{9.56}
\]

Fig. 9.10 Antineutrino and neutrino scattering off different flavours of quarks.


17 As is the case for many easy-to-use targets in scattering experiments, e.g. carbon.

Fig. 9.11 Experimental test of the Gross–Llewellyn Smith sum rule. The dashed line is a fit to the experimental data [98] for $F_3(x)$ and the solid line is the integral.


where, by definition, $u_s(x) = \bar u(x)$. Then, if we consider an average of
neutrino and antineutrino data, we have

\[
F_3(x) = u(x) - \bar u(x) + d(x) - \bar d(x)
\tag{9.57}
\]

Hence, by the definition of valence and sea quarks, we expect

\[
F_3(x) = u_v(x) + d_v(x)
\tag{9.58}
\]


Since there are three valence quarks in a proton, we can integrate $F_3(x)$
to get the Gross–Llewellyn Smith (GLS) sum rule [81]

\[
S_{GLS} = \int_0^1 F_3(x)\,dx = \int_0^1 [u_v(x) + d_v(x)]\,dx = 3
\tag{9.59}
\]

This prediction of the QPM is compared with experimental data in
Fig. 9.11. The measured value for the GLS sum [98] is $S_{GLS} = 2.50 \pm 0.018$ (statistical) $\pm\, 0.078$ (systematic), which is slightly lower than the
predicted value of 3 in the simple QPM. However, the difference can be
understood in terms of higher-order QCD corrections.

There is another interesting sum-rule check we can get from the QPM,
namely the Adler sum rule. We define the sum [11] as

\[
S_A = \int_0^1 \frac{F_2^{\nu n}(x) - F_2^{\nu p}(x)}{2x}\,dx
\tag{9.60}
\]

and we can see from eqns 9.53 and 9.54 that

\[
S_A = \int_0^1 [u(x) - \bar u(x) - d(x) + \bar d(x)]\,dx
\tag{9.61}
\]

Hence, from the definition of valence quarks and eqn 9.56, we expect
that

\[
S_A = \int_0^1 [u_v(x) - d_v(x)]\,dx
\tag{9.62}
\]


and since there are two (one) valence u (d) quarks, we expect $S_A = 1$.
This prediction is in good agreement with the neutrino DIS data over a
range of $Q^2$ [134], as shown in Fig. 9.12.
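The counting behind the GLS and Adler sum rules can be illustrated numerically. The sketch below uses toy valence densities of the form $x^a(1-x)^b$, normalised to two u and one d valence quark; these shapes and exponents are assumptions chosen for illustration and are not fitted to any data.

```python
import math


def beta_norm(a, b):
    """Integral of x**a * (1 - x)**b over [0, 1], i.e. the Euler beta function B(a+1, b+1)."""
    return math.gamma(a + 1) * math.gamma(b + 1) / math.gamma(a + b + 2)


def make_valence(n_quarks, a=0.5, b=3.0):
    """Toy valence density proportional to x**a (1-x)**b, normalised to n_quarks."""
    norm = beta_norm(a, b)
    return lambda x: n_quarks * x**a * (1.0 - x) ** b / norm


def integrate(f, n=20000):
    """Midpoint rule on [0, 1]."""
    h = 1.0 / n
    return h * sum(f((i + 0.5) * h) for i in range(n))


u_v = make_valence(2)          # two valence u quarks in the proton
d_v = make_valence(1, b=4.0)   # one valence d quark, with a different toy shape

s_gls = integrate(lambda x: u_v(x) + d_v(x))    # eqn 9.59
s_adler = integrate(lambda x: u_v(x) - d_v(x))  # eqn 9.62
print(f"GLS sum   = {s_gls:.4f}")
print(f"Adler sum = {s_adler:.4f}")
```

Whatever shapes are chosen, the two sums depend only on the valence quark numbers, which is why they are useful tests of the QPM.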

We will examine the measured shape of the parton distribution function
in Section 9.6.6. We can integrate $F_2$ to get the momentum fraction
of the proton carried by all quarks. From eqn 9.55,

\[
I = \int_0^1 F_2^{\nu N}\,dx = \int_0^1 x[u(x) + d(x) + s(x) + \bar u(x) + \bar d(x) + \bar s(x)]\,dx
\tag{9.63}
\]

Therefore, $I$ is the integral of the momentum-fraction-weighted quark
distribution functions; i.e. it represents the total momentum fraction
carried by all the quarks. Naively, we might have expected that $I = 1$
for the sum of all the quark flavours, but from the fits of the data [4]
shown in Fig. 9.13, we can see that about half of the nucleon momentum
is carried by particles that do not feel the electroweak force.¹⁸ In terms
of the theory of strong interactions, QCD, we conclude that this missing
momentum is carried by gluons. More direct evidence for the existence
of gluons will be discussed in Section 9.6.1.

9.5 Charged-lepton probes


Complementary data on the quark distribution functions can be obtained
from DIS using charged-lepton (e/μ) beams. As for DIS scattering
with neutrino beams, we proceed in the following stages:


(1) Calculate the cross section for eμ elastic scattering.
(2) Generalize to e–quark elastic scattering.
(3) Generalize to e–nucleon scattering.

Fig. 9.12 Measured value of $S_A$ (see eqn 9.62) as a function of $Q^2$ from neutrino DIS [134].


18 Figure 9.13 shows the results for the 
quarks and gluons. We will discuss how 
the gluon distribution is determined in 
Section 9.6.7, but for now the critical 
point is that the quarks do not carry 
all the momentum of the proton. 


Fig. 9.13 Momentum fraction of the proton carried by different constituents as a function of momentum transfer. The curves are the results of fitting the parton distribution functions [4] and performing the integral in eqn 9.63.

19 For a pure vector interaction as in the case of electromagnetism (or for a pure axial vector), it can be shown that at high energies there is helicity conservation, i.e. the helicity of an outgoing particle is the same as the incoming particle (see Chapter 6).


9.5.1 Electron–muon elastic scattering


We will calculate the eμ elastic scattering cross section by analogy with
the cross section for $\nu_e e$ scattering. As discussed in Section 9.4.1, we can
consider $\nu_e e$ scattering as due to the exchange of a virtual W boson.
The strength of the weak coupling constant $G_F$ can be related to the
dimensionless coupling constant of the weak interaction, $g$, and the mass
of the W, $M_W$ (see Chapter 7):

\[
G_F \sim \frac{g^2}{M_W^2}
\tag{9.64}
\]

At low values of momentum transfer $Q$, the strength of the weak interaction
is determined by $G_F$, but at high values of $Q$, of the order of $M_W$,
we have to allow for the effect of the W propagator, which is to modify
the effective strength of the weak interaction to be

\[
G_{\text{effective}} \sim \frac{g^2}{M_W^2 + Q^2}
\tag{9.65}
\]
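To get a feeling for when the W propagator matters, one can tabulate the ratio of the effective coupling at momentum transfer $Q$ to its low-$Q$ value, $M_W^2/(M_W^2 + Q^2)$. The sketch below assumes the proportionality of eqn 9.65 and uses an approximate W mass of 80.4 GeV; the function name is illustrative.

```python
M_W = 80.4  # approximate W boson mass in GeV


def propagator_suppression(q2, m_w=M_W):
    """Ratio G_effective(Q^2) / G_effective(0) = M_W^2 / (M_W^2 + Q^2)."""
    return m_w**2 / (m_w**2 + q2)


for q in (1.0, 10.0, 80.4, 300.0):  # momentum transfer Q in GeV
    ratio = propagator_suppression(q * q)
    print(f"Q = {q:6.1f} GeV  ->  G_eff / G_F = {ratio:.4f}")
```

For $Q \ll M_W$ the ratio is close to 1, which is why $G_F$ alone describes low-energy weak interactions; at $Q = M_W$ the effective strength has dropped by a factor of 2.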


Now, in the case of eμ scattering, the interaction is due to the exchange
of a photon. For a real photon, $M_\gamma = 0$ and the strength of the
electromagnetic interaction is given by the dimensionless fine-structure
constant $\alpha$. Therefore, by analogy with eqn 9.44 for elastic $\nu_e e$ scattering,
the cross section for elastic eμ scattering is given by

\[
\frac{d\sigma}{dy} \propto \frac{\alpha^2 s}{Q^4}\,F
\tag{9.66}
\]

where $F$ is a spin factor that we will now evaluate. The spin factor for
$\bar\nu_e e$ was easy to evaluate since the V–A interaction ensured that there
was only one possible spin configuration. In general, if we have different
spin configurations, then the recipe for calculating the cross section for
unpolarized beams in an experiment where one does not measure the
final-state spins is as follows:


(1) Calculate the cross section for each spin configuration (by cal- 
culating the amplitude A(0) with the rotation matrices as in 
Section 2.3). 


(2) Sum over final-state spins. 


(3) Average over initial-state spins. 


The possible spin configurations¹⁹ are shown in Fig. 9.14.

For the reactions in Fig. 9.14(a) and (b), the initial state has $J_z = 0$
and hence it has the same spin factor as for $\nu_e e$ scattering (see
eqn 9.42), giving $A(\theta)$ constant. Similarly, the amplitude for the
reaction in Fig. 9.14(c) can be evaluated in the same way as for $\bar\nu_e e$
scattering; the electron (muon) states have an initial value of $m = -\tfrac12$
and a final value of $m' = -\tfrac12$. Therefore, the amplitude is given by the
spin-½ rotation matrix

\[
A(\theta) = \left(d^{1/2}_{m'm}\right)^2 = \left(\cos\tfrac12\theta\right)^2 = \tfrac12(1 + \cos\theta)
\tag{9.67}
\]

and, using eqn 9.32 to change variables, we get

\[
A(y) = 1 - y
\tag{9.68}
\]

Similarly, for the reaction in Fig. 9.14(d), projecting out the spin states
gives

\[
A(\theta) = \left(d^{1/2}_{\frac12\frac12}\right)^2 = \left(\cos\tfrac12\theta\right)^2 = \tfrac12(1 + \cos\theta)
\tag{9.69}
\]

and, again using eqn 9.32 to change variables, we get

\[
A(y) = 1 - y
\tag{9.70}
\]

The electromagnetic interaction is not parity-violating and therefore
in summing over the final-state spins and averaging over the initial spins,
we can assume the same strength of interaction for left- and right-handed
helicity states. Therefore, summing the matrix elements squared for the
final states and averaging for the initial state, we get the spin factor

\[
F = 1 + (1-y)^2
\tag{9.71}
\]

so, from eqn 9.66, the cross section for elastic eμ scattering is given by

\[
\frac{d\sigma}{dy} = \frac{2\pi\alpha^2 s}{Q^4}\,[1 + (1-y)^2]
\tag{9.72}
\]

where the correct numerical factors have to be inserted from a full
calculation.²⁰

Fig. 9.14 Spin structure for electron–muon scattering.


20 See Griffiths in Further Reading.

21 We consider explicitly e–nucleon scattering here, but of course the analysis is identical for μ–nucleon scattering.


22 This is assuming parity-conserving photon exchange only.
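The change of variables from the scattering angle to $y$ in the eμ spin-factor calculation can be verified numerically. The sketch below assumes that eqn 9.32 has the form $y = \tfrac12(1 - \cos\theta)$ in the CMS (this equation is not reproduced here, so this is an assumption), and the function names are illustrative.

```python
import math


def y_from_theta(theta):
    """Assumed form of eqn 9.32: y = (1 - cos(theta)) / 2 in the CMS."""
    return 0.5 * (1.0 - math.cos(theta))


def amplitude(theta):
    """Eqns 9.67 and 9.69: (d^{1/2})^2 = cos^2(theta / 2)."""
    return math.cos(0.5 * theta) ** 2


def spin_factor(y):
    """Eqn 9.71: F = 1 + (1 - y)^2."""
    return 1.0 + (1.0 - y) ** 2


for theta in (0.0, math.pi / 3, math.pi / 2, math.pi):
    y = y_from_theta(theta)
    # Eqn 9.68: the helicity-flip-free amplitude equals 1 - y.
    assert abs(amplitude(theta) - (1.0 - y)) < 1e-12
    print(f"theta = {theta:.3f}  y = {y:.3f}  F = {spin_factor(y):.3f}")
```

The spin factor falls from 2 in the forward direction ($y = 0$) to 1 for backward scattering ($y = 1$), where only the $J_z = 0$ configurations contribute.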


9.5.2 Electron–quark elastic scattering


Now we can generalize the cross section for elastic eμ scattering to the
case of electron–quark scattering by allowing for the following:


• The fractional electric charge of the quark. We shall denote the
charge of the quark of flavour $i$ by $q_i$.


• If the quark carries a momentum fraction $x$, then the CMS energy
for the e–quark collision is given by the substitution $s \to \hat s = xs$
(see Section 9.4.4).


Therefore, from eqn 9.72, the cross section is given by

\[
\frac{d^2\sigma}{dx\,dy} = \frac{2\pi\alpha^2\, x s\, q_i^2}{Q^4}\,[1 + (1-y)^2]
\tag{9.73}
\]


9.5.3 Electron–nucleon deep inelastic scattering


Now we can generalize further to e–nucleon²¹ scattering in the QPM in a
similar way as we did for neutrino beams. We assume that e–nucleon DIS
can be calculated by adding incoherently the cross sections for scattering
off all quark flavours:²²

\[
\frac{d^2\sigma}{dx\,dy} = \frac{2\pi\alpha^2 s}{Q^4}\,[1 + (1-y)^2]\sum_i q_i^2\, x f_i(x)
\tag{9.74}
\]

where, as above, $q_i$ is the charge of quark flavour $i$, $f_i(x)$ gives the
probability distribution for quark flavour $i$, and, as usual, $x$ is the momentum
fraction of the nucleon carried by the quark. It is convenient to rearrange
this formula to give

\[
\frac{d^2\sigma}{dx\,dy} = \frac{4\pi\alpha^2 s}{Q^4}\left[(1-y) + \frac{y^2}{2}\right]\sum_i q_i^2\, x f_i(x)
\tag{9.75}
\]


We now compare this prediction of the QPM with a general
phenomenological formula

\[
\frac{d^2\sigma}{dx\,dy} = \frac{4\pi\alpha^2 s}{Q^4}\left[(1-y)\,F_2(x,Q^2) + y^2\, x F_1(x,Q^2)\right]
\tag{9.76}
\]

where $F_1$ and $F_2$ (the structure functions) are unknown functions of $x$
and $Q^2$. Equating coefficients of $(1-y)$ and $y^2$ between eqns 9.75 and
9.76, we have the result [59] that

\[
F_2(x,Q^2) = 2x F_1(x,Q^2)
\tag{9.77}
\]

This important prediction of the QPM, called the Callan–Gross relation,
arises because the quarks have spin ½ and interact via a vector
(parity-conserving) interaction. The experimental electron scattering
data from SLAC, as summarized in [117] and shown in Fig. 9.15, are
in good agreement with this prediction, which thus provides further
evidence for the existence of spin-½ quarks.
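The coefficient matching behind the Callan–Gross relation can be written out explicitly. The following is a sketch, assuming that the QPM cross section is proportional to $[(1-y) + y^2/2]\sum_i q_i^2 x f_i(x)$ and the phenomenological one to $[(1-y)F_2 + y^2 x F_1]$, with the same prefactor $4\pi\alpha^2 s/Q^4$ in both:

```latex
\begin{align*}
\text{coefficient of } (1-y):\quad & F_2(x,Q^2) = \sum_i q_i^2\, x f_i(x), \\
\text{coefficient of } y^2:\quad & x F_1(x,Q^2) = \tfrac{1}{2}\sum_i q_i^2\, x f_i(x), \\
\Rightarrow\quad & F_2(x,Q^2) = 2x F_1(x,Q^2).
\end{align*}
```

The factor of ½ in the $y^2$ term comes directly from the identity $1 + (1-y)^2 = 2(1-y) + y^2$, i.e. from the spin-½ helicity structure of the eμ cross section.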

Another key prediction of the QPM is obtained from a comparison
of the QPM prediction (eqn 9.75) with the phenomenological formula
(eqn 9.76):

\[
F_2(x,Q^2) = \sum_i q_i^2\, x f_i(x)
\tag{9.78}
\]

This means that the QPM predicts that the structure functions depend
on $x$ and not on $Q^2$, i.e. that they show a scaling behaviour
(called Bjorken scaling). This is very directly related to the quarks being
point-like particles. If the quarks had a finite size, we would expect the
structure functions to be suppressed by a form factor at large $Q^2$. The
experimental data are shown in Fig. 9.16 and show approximate scaling
behaviour.

There is a range of intermediate $x$ values for which $F_2$ is remarkably
constant over a very large variation in $Q^2$. This is an extension of the
earlier fixed-target data, which gave the first evidence for the QPM. At
low values of $x$, there is a strong rise in $F_2$ with $Q^2$, whereas at large
values of $x$ there is a decrease in $F_2$ with $Q^2$. These scaling violations
will be discussed qualitatively in the context of the theory of strong
interactions, QCD, in Section 9.6.6. It is interesting to note that, because
$Q^2 = sxy$, at high energy (i.e. large values of $s$) fixed values of $Q^2$ and $y$
correspond to lower values of $x$ than at lower energy. This implies that
the low-$x$ region is very important for high-energy machines such as the
Tevatron and particularly the LHC.




Fig. 9.15 Measured value of the ratio $2xF_1/F_2$ as a function of $x$. From [117].

Fig. 9.16 $F_2$ from fixed-target and HERA scattering experiments. From [115].

9.5.4 Further tests of the QPM


In this section, we will look at the QPM prediction for the structure
functions and consider some further internal consistency checks of the
theory. We start with the QPM prediction for the structure function
(eqn 9.78) and we again assume that nucleons contain only the lightest
three quark flavours:

\[
F_2^{ep}(x,Q^2) = x\left\{\tfrac49[u(x) + \bar u(x)] + \tfrac19[d(x) + \bar d(x) + s(x) + \bar s(x)]\right\}
\tag{9.79}
\]

If we assume isospin symmetry as we did when considering neutrino
DIS, then we expect $u^p(x) = d^n(x)$ and $\bar u^p(x) = \bar d^n(x)$. Therefore, the
equivalent structure function for a neutron target should be

\[
F_2^{en}(x,Q^2) = x\left\{\tfrac19[u(x) + \bar u(x) + s(x) + \bar s(x)] + \tfrac49[d(x) + \bar d(x)]\right\}
\tag{9.80}
\]

Then, for an isoscalar target with equal numbers of neutrons and
protons,

\[
F_2^{eN}(x,Q^2) = x\left\{\tfrac{5}{18}[u(x) + \bar u(x) + d(x) + \bar d(x)] + \tfrac19[s(x) + \bar s(x)]\right\}
\tag{9.81}
\]

which we can compare with the equivalent structure function for neutrino
scattering (eqn 9.55), and if we ignore the strange quarks (these
turn out to be negligible except at very low values of $x$), then we get the
prediction that

\[
F_2^{eN}(x,Q^2) = \tfrac{5}{18}\,F_2^{\nu N}(x,Q^2)
\tag{9.82}
\]

where the numerical factor of 5/18 is just the average value of the square
of the quark charges. The experimental data (Fig. 9.17) are in good
agreement with the prediction and hence confirm the charge assignments
for the u and d quarks.

Fig. 9.17 Comparison of $F_2(x)$ from DIS muon and neutrino data. The muon data have been multiplied by a factor of 18/5 so that the comparison with the neutrino data provides a test [115] of the prediction of eqn 9.81.

Another interesting consistency check is given by a comparison of the
structure functions for ep and en scattering (eqns 9.79 and 9.80):

\[
F_2^{ep}(x,Q^2) - F_2^{en}(x,Q^2) = \tfrac13\,x\,[u(x) - d(x) + \bar u(x) - \bar d(x)]
\tag{9.83}
\]

We now split the quark distribution functions into valence and sea
components:

\[
F_2^{ep}(x,Q^2) - F_2^{en}(x,Q^2) = \tfrac13\,x\,\{u_v(x) - d_v(x) + 2[\bar u(x) - \bar d(x)]\}
\tag{9.84}
\]

Then, if we integrate eqn 9.84, we obtain the Gottfried sum rule

\[
I_G = \int_0^1 \frac{F_2^{ep}(x,Q^2) - F_2^{en}(x,Q^2)}{x}\,dx
    = \int_0^1 \tfrac13\,\{u_v(x) - d_v(x) + 2[\bar u(x) - \bar d(x)]\}\,dx
\tag{9.85}
\]

In the QPM, there are two valence up quarks and one valence
down quark in the proton, so if we have isospin symmetry, then we
should expect $I_G = \tfrac13$. The experimental data [18] are shown in
Fig. 9.18 and clearly do not agree with this prediction. This can be
understood in terms of the Pauli exclusion principle. Antiquarks are
created by the strong interaction at the same time as quarks (the
strong interaction conserves quark flavour). The Pauli exclusion principle
forbids the creation of a quark in the same quantum state as one
of the existing valence quarks. Since there are two (one) valence up
(down) quarks in a proton, it is therefore easier to create
$d\bar d$ quark pairs than $u\bar u$ pairs. Hence the value of $I_G$ should be less
than $\tfrac13$.
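The charge factors that appear in these consistency checks are simple enough to compute exactly. The sketch below uses the standard quark charges ($+\tfrac23$ for u, $-\tfrac13$ for d) to reproduce the factor 5/18 and the QPM expectation for the Gottfried sum; the sea-asymmetry argument, $\int_0^1[\bar u(x) - \bar d(x)]\,dx$, is a hypothetical input and the value $-1/10$ is purely illustrative, not a measurement.

```python
from fractions import Fraction

Q_U = Fraction(2, 3)    # electric charge of the u quark (units of e)
Q_D = Fraction(-1, 3)   # electric charge of the d quark (units of e)

# Mean squared charge of u and d quarks: the factor relating F2^{eN} to F2^{nu N}.
mean_sq_charge = (Q_U**2 + Q_D**2) / 2


def gottfried_sum(sea_asymmetry):
    """I_G = (q_u^2 - q_d^2) * [(n_uv - n_dv) + 2 * integral(ubar - dbar)]."""
    n_uv, n_dv = 2, 1   # valence quark numbers in the proton
    return (Q_U**2 - Q_D**2) * ((n_uv - n_dv) + 2 * sea_asymmetry)


print(mean_sq_charge)                    # 5/18
print(gottfried_sum(Fraction(0)))        # 1/3
print(gottfried_sum(Fraction(-1, 10)))   # 4/15
```

A flavour-symmetric sea gives exactly 1/3, while an excess of $\bar d$ over $\bar u$ (negative asymmetry) lowers the sum, which is the direction of the observed discrepancy.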


9.5.5 Electroweak unification at HERA

mediated by photon and $Z^0$ exchange (neutral current), and $e^\pm p \to \nu X$,
mediated by $W^\pm$ exchange (charged current). It can be seen that for
$Q^2 > M_W^2$, the four cross sections are of comparable magnitude. At
smaller $Q^2$, the $1/Q^2$ of the photon propagator dominates and the
neutral-current cross sections increase rapidly, while the charged-current
cross sections become approximately constant in $Q^2$.


9.6 QCD introduction 


We have seen that about 50% of the momentum in a proton is carried
by particles that do not feel the electromagnetic or weak force,
and we have tentatively ascribed this momentum to gluons. We will
examine direct evidence for gluons in Section 9.6.1, including a
measurement of the spin of the gluon. In Section 9.6.2, we will see how to
use $e^+e^-$ annihilation data to measure the number of colours carried
by quarks. Using this experimental knowledge as input, we can construct
a theory of strong interactions called quantum chromodynamics
(QCD) based on the symmetry group SU(3). We will give an introduction
to this theory in Section 9.6.3. We will then introduce the concept
of ‘running coupling constants’ in Section 9.6.4. We will examine this
quantitatively for QED and then in a qualitative way for QCD. This
leads to the concept of ‘asymptotic freedom’, which allows us to perform
perturbation-theory calculations in QCD if we have large values
of $Q^2$. We then consider some experimental techniques for measuring
the strong coupling constant $\alpha_s(Q^2)$ and show that the experimental
measurements are consistent with QCD and provide clear evidence of
the running of $\alpha_s(Q^2)$. Finally, we will review some evidence that is
sensitive to the choice of the SU(3) group for the colour symmetry.

Fig. 9.19 Total cross sections as functions of $Q^2$ for the processes $ep \to eX$ (neutral current, NC) and $ep \to \nu X$ (charged current, CC) measured at the HERA collider [82]. The data from the H1 and ZEUS experiments have been combined. The fit is from HERAPDF2.0.

Fig. 9.20 Event display for a 3-jet event from the TASSO experiment at PETRA [136].


9.6.1 Direct evidence for gluons 


We have seen strong but indirect evidence for the existence of gluons
from the momentum sum rule (Section 9.4.4), but there is much more
direct evidence from $e^+e^-$ annihilation. In the naive QPM we would
only expect 2-jet events, but in QCD we can have multijet events. For
example, 3-jet events will be produced by the gluon bremsstrahlung
process $e^+e^- \to q\bar q g$. Such events were observed at the PETRA collider
at DESY. An example of a clear 3-jet event in the TASSO detector [136]
is shown in Fig. 9.20. Many statistical tests were performed to show that
these events were not just due to fluctuations of ‘2-jet’ events.

For example, if the events were due to gluon bremsstrahlung, we would
expect planar events. The measured transverse momentum out of the
plane of the 3 jets was much smaller than that in the plane, confirming
the existence of 3-jet events [45]. Apart from confirming the existence
of gluons, measurements of the angular distribution of the third jet are
sensitive to the gluon spin. In a sample of 3-jet events from the TASSO
experiment, the jet energies are ordered $E_1 > E_2 > E_3$; the event is
boosted to the CMS frame of jets 2 + 3; the angle $\theta$ is defined as shown
in Fig. 9.21. The results from the TASSO experiment [54] are shown in
Fig. 9.22, from which it can be seen that the data are consistent with a
spin-1 gluon but clearly exclude a spin-0 gluon.

[Fig. 9.20 display label: TASSO, $E_{cm}$ = 35 GeV.]


9.6.2 Number of colours 


Now that we have seen compelling evidence for the existence of
point-like spin-½ quarks and spin-1 gluons, we are almost ready to consider
the theory of strong interactions, QCD. However, we first need to review
the evidence that there are three ‘colour’ degrees of freedom for quarks.
The classic experimental test of the number of colours is the ratio

\[
R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)}
\tag{9.86}
\]

In the QPM, the production of hadrons in $e^+e^-$ interactions proceeds
via $q\bar q$ states that fragment with unit probability to two jets. Therefore,
the fundamental Feynman diagrams for the two processes in eqn 9.86 are
the same and the only differences are the charges of the quarks ($q_i$) and
the number of quark colours ($N_c$). Therefore, in the QPM, we expect

\[
R = \frac{\sigma(e^+e^- \to \text{hadrons})}{\sigma(e^+e^- \to \mu^+\mu^-)} = N_c \sum_i q_i^2
\tag{9.87}
\]

where the sum runs over all available quark flavours (i.e. those for which
the CMS energy $E > 2m_i$, with $m_i$ being the quark mass for flavour $i$).
The experimental data [115] are shown in Fig. 9.23.

Fig. 9.21 Definition of the Ellis–Karliner angle $\theta$ in 3-jet events. The jets are ordered in decreasing energy and are shown in (a) their common CMS and (b) the CMS of jets 2 and 3.


Fig. 9.22 Measurement of the angular distribution in 3-jet events [54] (see the text for the definition of the angular variable used) and comparisons with spin-1 (vector) and spin-0 (scalar) gluons.

Fig. 9.23 Measurement of the ratio R [115] (see text) and comparisons with the QPM and QCD calculations.


24 We have already looked at this group when we were considering the static quark model for hadrons. In that case, we were using an approximate flavour symmetry between the light quarks u, d, and s. Here we will be considering an exact SU(3) colour symmetry.


[Fig. 9.23 plot: R versus √s (GeV) in two panels, covering roughly 3–5 GeV and 9.5–11 GeV. Data sets shown: Mark-I, Mark-I + LGW, Mark-II, PLUTO, DASP, Crystal Ball, BES, ARGUS, CLEO, CUSB, DHHM, CLEO II, LENA.]


The data show sharp peaks corresponding to hadron resonances. In
this region, the quarks are strongly bound and the QPM is not an
appropriate approximation. However, away from these resonances, R is
approximately constant and shows the step increases as the CMS energy
crosses the thresholds for the different quark flavours. The data are
clearly inconsistent with $N_c = 1$ and are approximately consistent with
$N_c = 3$. We will consider the small discrepancies when we consider QCD
theory in the next section.
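The step structure of R in the QPM follows directly from eqn 9.87 and can be tabulated in a few lines. In the sketch below the quark charges are the standard ones, while the quark masses used for the thresholds are rough illustrative values that are not taken from the text; the function name is hypothetical.

```python
from fractions import Fraction

# (flavour, charge in units of e, rough mass in GeV). The masses are
# illustrative threshold values only, assumed for this sketch.
QUARKS = [
    ("u", Fraction(2, 3), 0.002),
    ("d", Fraction(-1, 3), 0.005),
    ("s", Fraction(-1, 3), 0.1),
    ("c", Fraction(2, 3), 1.3),
    ("b", Fraction(-1, 3), 4.2),
    ("t", Fraction(2, 3), 173.0),
]


def r_ratio(cms_energy, n_colours=3):
    """QPM prediction of eqn 9.87: N_c * sum of q_i^2 over flavours with E > 2 m_i."""
    return n_colours * sum(q**2 for _, q, m in QUARKS if cms_energy > 2 * m)


for energy in (2.0, 5.0, 20.0):   # CMS energies in GeV
    print(f"E = {energy:5.1f} GeV: R(N_c=3) = {r_ratio(energy)}, "
          f"R(N_c=1) = {r_ratio(energy, n_colours=1)}")
```

With three colours the prediction steps from 2 (u, d, s) to 10/3 (adding c) and 11/3 (adding b); with a single colour each value is three times smaller, which is the comparison the data rule out.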


9.6.3 QCD 


Now we have considered the evidence that quarks come in 3 colours and
that they interact via the exchange of massless coloured spin-1 gluons, we
are ready to consider the theory of strong interactions, QCD. The theory
is based on the gauge group SU(3).²⁴ This SU(3) group is associated
with 3 × 3 unitary matrices. A general complex 3 × 3 matrix requires
3 × 3 × 2 = 18 real parameters to specify it. As the matrices are unitary,
they must satisfy $U^\dagger U = I$, or

\[
\sum_j U^\dagger_{ij} U_{jk} = \delta_{ik}, \qquad \text{i.e.} \qquad \sum_j U^*_{ji} U_{jk} = \delta_{ik}
\tag{9.88}
\]

For the diagonal terms, this sum is over terms like $U^*_{ji}U_{ji}$, which must be
real (i.e. no imaginary parts), and therefore this yields 3 constraints for
SU(3). For the off-diagonal terms, both the real and imaginary parts in
eqn 9.88 must be equal to 0. There are 6 off-diagonal elements, but if the
$(i, j)$ element of the product is 0, then the $(j, i)$ element will also be 0.
Therefore, there are 3 off-diagonal elements to consider, each of which
provides 2 constraints. So the total number of constraints is 3 + 2 × 3 = 9,
which leaves 9 parameters. Now consider the determinant of eqn 9.88:

\[
\det(U^\dagger U) = \det(I) = 1
\tag{9.89}
\]

However, $U^\dagger = (U^T)^*$, which implies that $\det(U^\dagger) = \det(U^T)^* = \det(U)^*$,
so, from eqn 9.89, $\det(U)\det(U)^* = 1$, i.e. $\det(U) = e^{i\phi}$ (where $\phi$ is
a real number). The special unitary group SU(3) has the additional
constraint that $\det(U) = +1$. This additional constraint means that we
need 8 parameters and there are 8 generators for SU(3) and therefore
8 gluons. We can represent the three colour states of the quarks as
$|r\rangle$, $|b\rangle$, and $|g\rangle$. The use of the term ‘colour’ can lead to some confusion,
since it is just a label for a state and has nothing to do with ordinary
colour. In particular, a state $|r\rangle|b\rangle|g\rangle$ would not be colourless. We expect
there to be 8 generators for the group and therefore 8 colours of gluons
from the arguments given above. Although the physics of colour SU(3) is
unrelated to that of flavour SU(3), the group theory is of course identical.
We can therefore use the results from Chapter 5 to write down the colour
wavefunctions of the 8 gluons as shown in Table 9.1. Mathematically,
there can be a colour-singlet state

\[
\sqrt{\tfrac13}\,\left(|r\rangle|\bar r\rangle + |b\rangle|\bar b\rangle + |g\rangle|\bar g\rangle\right)
\]

but if such a gluon state existed in nature, then, as a colour singlet, it
could mediate a long-range strong force. Clearly then, there can be no
such state for a massless gluon.²⁵
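The parameter counting for SU(3) generalizes to any N and can be written as a few lines of arithmetic. The sketch below simply follows the steps of the argument above (real parameters of a complex matrix, minus the unitarity constraints, minus one for the determinant condition); the function name is illustrative.

```python
def count_generators(n, special=True):
    """Count the real parameters of an n x n unitary (or special unitary) matrix."""
    real_parameters = 2 * n * n              # a general complex n x n matrix
    diagonal_constraints = n                 # diagonal of U^dagger U: one real condition each
    off_diagonal_pairs = n * (n - 1) // 2    # (i, j) and (j, i) give the same condition
    constraints = diagonal_constraints + 2 * off_diagonal_pairs
    remaining = real_parameters - constraints    # parameters of U(n)
    # det(U) = +1 removes the overall phase, i.e. one further parameter.
    return remaining - 1 if special else remaining


for n in (2, 3):
    print(f"U({n}): {count_generators(n, special=False)} parameters, "
          f"SU({n}): {count_generators(n)} generators")
```

For n = 3 this reproduces 18 − 9 = 9 parameters for U(3) and 8 generators for SU(3), i.e. 8 gluons; for n = 2 it gives the familiar 3 generators of SU(2).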

QCD is a similar gauge theory to QED in many ways. We can determine
the interactions between quarks and gluons by starting with a
free field theory for the quarks and then imposing local SU(3) gauge
symmetry. Consider²⁶ a local gauge transformation specified by $\Lambda_a(x)$ such
that the transformed quark field is

\[
\psi'(x) = \exp[i g_s \Lambda_a(x) T_a]\,\psi(x)
\tag{9.90}
\]

where $g_s$ will turn out to be the strong-interaction coupling constant, $T_a$
are the generators of SU(3), and an implicit summation over repeated
indices is assumed. The infinitesimal transformations of eqn 9.90 are

\[
\psi'(x) = [1 + i g_s \Lambda_a(x) T_a]\,\psi(x)
\tag{9.91}
\]

In order to keep gauge invariance for the Lagrangian, we are obliged to
introduce 8 gauge fields (gluons) $G^a_\mu(x)$. These transform under SU(3) as

\[
G^a_\mu(x) \to G^a_\mu(x) - \partial_\mu \Lambda_a(x) - g_s f_{abc}\,\Lambda_b(x)\,G^c_\mu(x)
\tag{9.92}
\]
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ga = 4/5 (DIG 


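$g_7 = -i\sqrt{\tfrac12}\,\left(|g\rangle|\bar r\rangle - |r\rangle|\bar g\rangle\right)$


$g_8 = \sqrt{\tfrac16}\,\left(|r\rangle|\bar r\rangle + |b\rangle|\bar b\rangle - 2|g\rangle|\bar g\rangle\right)$


Table 9.1 Gluon colour wavefunctions.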


25A meson state qq will be in such 
a colour-singlet state and can be ex- 
changed between hadrons, but the force 
is short-range because of the mass of 
the meson. 


26 For a more pedagogical look at local gauge transformations and Lagrangians, see Sections 12.1 and 12.4, respectively.

27 Another choice would of course have resulted in the identical amplitude.

Fig. 9.24 Feynman diagram for quark–quark scattering for red quarks (t-channel diagram).

and $f_{abc}$ are the SU(3) structure constants, given by the commutation
relations

\[
[T_a, T_b] = i f_{abc} T_c
\tag{9.93}
\]

We proceed in a similar way to QED by replacing the partial derivative
with the covariant derivative

\[
D_\mu = \partial_\mu + i g_s T_a G^a_\mu
\tag{9.94}
\]

Again following QED, we need to add a kinetic energy term for the
gluon fields:

\[
\mathcal{L}_{\text{gluon}} = -\tfrac14 F^a_{\mu\nu} F_a^{\mu\nu}
\tag{9.95}
\]

where the field tensor is given by

\[
F^a_{\mu\nu} = \partial_\mu G^a_\nu - \partial_\nu G^a_\mu - g_s f_{abc}\,G^b_\mu G^c_\nu
\tag{9.96}
\]

The Lagrangian contains interactions between the quarks and the gluons
in a similar way to QED. What is new is that the non-Abelian nature
of SU(3) leads to the extra term in eqn 9.96, which, when substituted
into the Lagrangian of eqn 9.95, generates terms proportional to $G^3$ and
$G^4$. Therefore, the gauge invariance coupled with the fact that SU(3) is
non-Abelian implies that there are 3-gluon and 4-gluon vertices. As we
shall see in Section 9.6.4, this makes the theory of QCD very different
from QED.

We can use the gluon wavefunctions to determine the relative
amplitudes for different colour combinations of quark and antiquark
scattering:


(1) First let us consider scattering of quarks of identical colour. We
make the arbitrary choice of using red quarks,²⁷ $q_r q_r \to q_r q_r$ (see
Fig. 9.24). Colour is conserved at each vertex because colour SU(3)
is an exact symmetry (in the same way as electric charge is conserved
in QED). Therefore, we need to consider gluons that contain
$r\bar r$ in the colour wavefunction. From Table 9.1, we find $g_3$ and $g_8$,
and we can calculate the relative amplitude for $g_3$ exchange from
the normalization constants as $a_3 = \sqrt{\tfrac12}\sqrt{\tfrac12} = \tfrac12$ and that for $g_8$
exchange as $a_8 = \sqrt{\tfrac16}\sqrt{\tfrac16} = \tfrac16$. Adding the two amplitudes gives a factor
of $\tfrac23$.

(2) Similarly, if we consider the case of $q_r \bar q_r \to q_r \bar q_r$, the exchange
gluons are $g_3$ and $g_8$ as above, but we need to remember the minus
sign for antiquarks (in the same way as for negatively charged
electrical particles in QED). Therefore, the amplitude is proportional
to $-\tfrac23$.

(3) Next, we will consider the amplitude for scattering of two quarks of
different colours (which for convenience we will take to be red and
blue). For the process $rb \to rb$ (as in Fig. 9.25(a)), we need gluons
with $r\bar r$ and $b\bar b$, i.e. again $g_3$ and $g_8$, which gives
$a_3 = -\sqrt{\tfrac12}\sqrt{\tfrac12} = -\tfrac12$ and $a_8 = \sqrt{\tfrac16}\sqrt{\tfrac16} = \tfrac16$, so the resulting
amplitude is $a(rb \to rb) = -\tfrac13$. The diagram in Fig. 9.25(b) is for the
process $rb \to br$ and involves gluons with $r\bar b$ and $b\bar r$, i.e. $g_1$ and $g_2$,
which gives $a_1 = \sqrt{\tfrac12}\sqrt{\tfrac12} = \tfrac12$ and $a_2 = \tfrac12$. Adding the two amplitudes
gives $a(rb \to br) = 1$.


We can now easily write down the amplitude for $q\bar q$ states, remembering
to use the negative sign for the antiquarks. Thus, $a(r\bar r \to r\bar r) =
-a(rr \to rr) = -\tfrac23$ and $a(r\bar b \to r\bar b) = -a(rb \to rb) = +\tfrac13$. In a similar way,
we can relate $r\bar r \to b\bar b$ with $rb \to br$ because they both involve exchange
of gluons $g_1$ and $g_2$. This gives $a(r\bar r \to b\bar b) = -a(rb \to br) = -1$.


The results for the different combinations of colours are summarized
in Table 9.2. Now that we have all the pieces, we can easily evaluate
the colour factors for $q\bar{q}$ in a colour-singlet configuration. This
will turn out to be very interesting, since it gives some insights into
the origin of quark confinement. For the colour singlet, we need

$$|q\bar{q}\rangle = \frac{1}{\sqrt{3}}\left(|r\bar{r}\rangle + |b\bar{b}\rangle + |g\bar{g}\rangle\right) \qquad (9.97)$$

We have to allow for the wavefunction normalization in eqn 9.97 and
for the scattering amplitudes given in Table 9.2. Thus,
$r\bar{r} \to r\bar{r}$ gives a factor of $(1/\sqrt{3})(1/\sqrt{3})(-2/3)$.
Allowing for the three colours gives a factor of 3, which gives $-2/3$.
$r\bar{r} \to b\bar{b}$ gives $(1/\sqrt{3})(1/\sqrt{3})(-1)$ and we have to
allow for an equivalent term $r\bar{r} \to g\bar{g}$. Again we allow for the
three colours, so the result is $3(-2/3) = -2$. Adding, we obtain the
overall result for the colour factor for $q\bar{q}$ in a colour-singlet
state, which is²⁸ $-8/3$. We can also calculate colour factors for gluon
coupling $g \to gg$. In the conventional normalization, these colour
factors (called ‘Casimir factors’) are given by


Quarks                        Antiquarks

$a(rr \to rr) = 2/3$          $a(r\bar{r} \to r\bar{r}) = -2/3$
$a(rb \to rb) = -1/3$         $a(r\bar{b} \to r\bar{b}) = 1/3$
$a(rb \to br) = 1$            $a(r\bar{r} \to b\bar{b}) = -1$


Table 9.2 Colour factors for quark and antiquark scattering by gluon exchange.
Fig. 9.25 Feynman diagrams for quark-quark scattering for red and
blue quarks (t-channel diagrams): (a) exchange of $g_3$, $g_8$;
(b) exchange of $g_1$, $g_2$.


28 Different conventions for the definition of the strong-interaction
coupling constant can give results for these factors differing by a
factor of 2, but the results for any cross section are the same.
29 When we calculated the Rutherford scattering cross section in
Section 9.1, we used the Fourier transform to calculate the amplitude
as a function of momentum transfer from the known Coulomb potential.
Here we are performing the inverse Fourier transform to determine the
potential, starting from the amplitude as a function of momentum
transfer.


Fig. 9.26 Lowest-order Feynman diagram for $e^-e^- \to e^-e^-$ scattering
(a) and an $O(\alpha)$ correction with an $e^+e^-$ loop (b).


30 Our explanation is based on that given in Burcham and Jobes (see
Further Reading).


$C_F = 4/3$ for $q \to qg$ and $C_A = 3$ for $g \to gg$ coupling (see
Cooper-Sarkar and Devenish in Further Reading).
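
The colour-singlet sum described above is easy to check by hand, but a short script makes the bookkeeping explicit. The following is only an illustrative sketch: it takes the antiquark entries of Table 9.2 as input and assumes, by colour symmetry, that the amplitudes for the other colour pairs equal those quoted for red and blue. The names `A_SAME_COLOUR`, `A_COLOUR_CHANGE` and `singlet_colour_factor` are ours, not the book's.

```python
from fractions import Fraction

# Antiquark column of Table 9.2 (relative colour amplitudes).
# Assumption: the same values hold for every colour by symmetry.
A_SAME_COLOUR = Fraction(-2, 3)   # a(r rbar -> r rbar)
A_COLOUR_CHANGE = Fraction(-1)    # a(r rbar -> b bbar)

COLOURS = ("r", "b", "g")


def singlet_colour_factor():
    """Sum the amplitude over the nine colour pairs of eqn 9.97."""
    # One factor 1/sqrt(3) from the initial state and one from the final state.
    norm = Fraction(1, 3)
    total = Fraction(0)
    for initial in COLOURS:
        for final in COLOURS:
            same = initial == final
            amplitude = A_SAME_COLOUR if same else A_COLOUR_CHANGE
            total += norm * amplitude
    return total


print(singlet_colour_factor())  # -8/3
```

The three diagonal terms contribute $-2/3$ in total and the six colour-changing terms contribute $-2$, which reproduces the two partial sums quoted in the text.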

Note that we have focused on the contribution to the vertex factor
from the gluon colour, but the overall amplitudes will also pick up a
factor of the strong-interaction coupling constant and a propagator term.
Since the gluons are massless like the photon, the propagator term is
$1/q^2$, where $q$ is the 4-momentum transfer. In summary, the overall
amplitude for the interaction of the $q\bar{q}$ colour-singlet state is given by


$$a(q\bar{q}) \sim -\frac{\alpha_s}{q^2} \qquad (9.98)$$


In the non-relativistic limit, we can Fourier transform the amplitude
given in eqn 9.98 from momentum space to position space.²⁹ This gives
a potential that is negative and scales with distance as $1/r$. It turns
out that this is the only combination of two quarks/antiquarks that
gives a negative potential. This gives some explanation of why we find
bound $q\bar{q}$ states but not states with net colour like $qq$. This is
suggestive of colour confinement, which states that the only stable
hadrons are colour singlets. However, as we will see in the next section,
the strong-interaction coupling ‘constant’ becomes large at low values of
$q^2$, which means that the perturbation theory we have used will no longer
be valid, so this result should be considered as a qualitative indication
of colour confinement, rather than a proof.


9.6.4 Running coupling constants 


In QCD, the effect of the running coupling constant is very important.
Even in QED, the fine-structure ‘constant’ is not constant but varies
with the scale $Q^2$ of the reaction being studied. This is due to the
effects of shielding by virtual $e^+e^-$ pairs, which reduces the effective
strength of the interaction between two charges as the distance increases
or, equivalently, as the scale $Q^2$ decreases. A first naive attempt to
calculate the effects of these higher-order corrections results in
meaningless infinities. We will see how renormalization theory allows us
to overcome this problem.³⁰ Consider the case of $ee \to ee$. The
lowest-order Feynman diagram is shown in Fig. 9.26(a). There is an
$O(\alpha)$ correction from the $e^+e^-$ loop diagram shown in Fig. 9.26(b).
Evaluation of the loop requires an integration over the 4-momentum running
round the loop. It can be shown that this integral modifies the amplitude
by a factor $1 - I(q^2)$, where $q$ is the 4-momentum of the photon and


$$I(q^2) = \frac{\alpha}{3\pi}\int_{m^2}^{\infty}\frac{dp^2}{p^2} - \frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\ln\left[1 - \frac{q^2 x(1-x)}{m^2}\right] \qquad (9.99)$$


The result is formally logarithmically divergent and for now we introduce
an arbitrary upper limit $\Lambda$, but we will see that our final result is
independent of the value of $\Lambda$. For the case $-q^2 \gg m^2$, we can
evaluate the integral (see Exercise 9.3) as

$$I(q^2) \approx \frac{\alpha}{3\pi}\ln\left(\frac{\Lambda^2}{-q^2}\right) \qquad (9.100)$$


There are also higher-order diagrams to consider, with two, three, four,
etc. $e^+e^-$ loops. The effect on the matrix element is to introduce a
multiplicative correction factor

$$F = 1 - I(q^2) + [I(q^2)]^2 - [I(q^2)]^3 + \cdots \qquad (9.101)$$

We can sum the infinite geometric series in eqn 9.101 and use eqn 9.100
to get a factor

$$F = \frac{1}{1 + (\alpha/3\pi)\ln(\Lambda^2/Q^2)} \qquad (9.102)$$


where $Q^2 = -q^2$. We can interpret this result by regarding $\alpha$ as the
bare coupling constant (which we call $\alpha_0$), to which the measured
coupling constant is related via

$$\alpha = \frac{\alpha_0}{1 + (\alpha_0/3\pi)\ln(\Lambda^2/Q^2)} \qquad (9.103)$$
This result shows that the effective coupling constant depends on the
scale $Q^2$ as

$$\alpha(Q^2) = \frac{\alpha_0}{1 + [\alpha_0/3\pi]\ln(\Lambda^2/Q^2)} \qquad (9.104)$$

We can of course write down an equivalent formula at a different scale $\mu$.
We can then combine these two expressions (see Exercise 9.4) to find a
relationship between the coupling constants at the two different scales:³¹

$$\alpha(Q^2) = \frac{\alpha(\mu^2)}{1 - [\alpha(\mu^2)/3\pi]\ln(Q^2/\mu^2)} \qquad (9.105)$$

This result is remarkable because the dependence on the arbitrary cut-off
parameter $\Lambda$ has disappeared. This means that QED is a fully consistent
theory up to any arbitrary energy scale.³² The result of renormalization
predicts that the value of $\alpha(Q^2)$ increases slowly with $Q^2$. This
effect amounts to about 7% when comparing $Q = M_Z$ with $Q = m_e$, and this
prediction has been verified. We can understand this result in a
qualitative way by noting that the charge of one electron seen by the other
will be decreased as a result of screening by the $e^+e^-$ pairs. At higher
values of $Q^2$, the photon has a shorter wavelength and so penetrates more
of the screening charges and ‘sees’ a larger effective charge.

In the case of QCD, we can have analogous shielding effects from the
quarks, but there is also an anti-shielding effect from the gluons, since
31 Here we have only included the effects of $e^+e^-$ loops, but if $Q^2$ is
sufficiently large, we need to consider the effects of other leptons and
quarks. There is an additional subtlety in that we have ignored other
Feynman diagrams, but it turns out that these cancel exactly.


32 From eqn 9.105, the value of the coupling constant will become infinite
at $Q \sim 10^{280}$ MeV, but this would only mean that perturbation theory
breaks down, not that the theory is fundamentally wrong.
33 At lowest order in perturbation theory, it is convenient to determine a
value of $\Lambda_{QCD}$; however, when higher-order calculations are
considered, it is better to measure a value of $\alpha_s(Q^2)$ at a
particular scale. Conventionally, the scale used is $M_Z$.


Fig. 9.27 One Feynman diagram for the process $e^+e^- \to q\bar{q}g$.


they have a colour charge (unlike photons, which are neutral). The net
result is that the running coupling constant for QCD is given by

$$\alpha_s(Q^2) = \frac{\alpha_s(\mu^2)}{1 + [\alpha_s(\mu^2)/12\pi](33 - 2n_f)\ln(Q^2/\mu^2)} \qquad (9.106)$$

where $n_f$ is the number of active flavours, which depends on the scale
$Q$ compared with the mass of the different flavours of quarks. At a
particular value of $Q^2 = \Lambda^2_{QCD}$, the denominator in eqn 9.106 will
become equal to 0. This happens when


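$$\ln(\Lambda^2_{QCD}/\mu^2) = -\frac{12\pi}{(33 - 2n_f)\,\alpha_s(\mu^2)} \qquad (9.107)$$

We can then use eqn 9.107 to eliminate $\alpha_s(\mu^2)$ from eqn 9.106 and
obtain an expression for the strong-interaction running coupling constant
in terms of one unknown parameter $\Lambda_{QCD}$:

$$\alpha_s(Q^2) = \frac{12\pi}{(33 - 2n_f)\ln(Q^2/\Lambda^2_{QCD})} \qquad (9.108)$$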


For $n_f < 17$, we can see from eqn 9.108 that the value of $\alpha_s(Q^2)$
decreases with increasing $Q^2$, which already gives an explanation for the
success of the naive QPM: at large $Q^2$, the value of $\alpha_s$ becomes small
enough for perturbation theory to be valid and the QPM can be seen to
correspond to the processes that are of lowest order in $\alpha_s$. At small
$Q^2$, the value of $\alpha_s$ will approach unity, perturbation theory will
break down, and the parton model will not even be a useful approximation.
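
To make the opposite behaviour of the two couplings concrete, the sketch below evaluates eqn 9.105 for QED and eqns 9.106 and 9.108 for QCD. It is a lowest-order illustration only. The inputs are assumptions chosen for the example rather than values taken from the text: $\alpha = 1/137.036$ at $\mu = m_e$, electron loops only in QED, and $\Lambda_{QCD} = 0.2$ GeV with $n_f = 5$ held fixed in QCD.

```python
import math

ALPHA_AT_ME = 1 / 137.036   # assumed value of alpha at the scale mu = m_e
M_E = 0.000511              # electron mass in GeV, used as reference scale


def alpha_qed(q2, mu2=M_E ** 2, alpha_mu=ALPHA_AT_ME):
    """Eqn 9.105 with electron loops only (q2 and mu2 in GeV^2)."""
    return alpha_mu / (1 - alpha_mu / (3 * math.pi) * math.log(q2 / mu2))


def alpha_s_from_lambda(q2, lambda_qcd, n_f):
    """Eqn 9.108: alpha_s in terms of Lambda_QCD (GeV) and n_f."""
    return 12 * math.pi / ((33 - 2 * n_f) * math.log(q2 / lambda_qcd ** 2))


def alpha_s_evolved(q2, mu2, alpha_mu, n_f):
    """Eqn 9.106: alpha_s at q2 from its value alpha_mu at mu2."""
    slope = alpha_mu / (12 * math.pi) * (33 - 2 * n_f)
    return alpha_mu / (1 + slope * math.log(q2 / mu2))


if __name__ == "__main__":
    lambda_qcd, n_f = 0.2, 5   # illustrative choices, not fitted values
    for q in (10.0, 91.19, 1000.0):   # scales in GeV
        q2 = q ** 2
        print(q, 1 / alpha_qed(q2), alpha_s_from_lambda(q2, lambda_qcd, n_f))
```

With electron loops alone the QED coupling rises by much less than the 7% quoted in the text, which is consistent with the remark in the margin note that the other leptons and the quarks must be included at large $Q^2$. Evolving $\alpha_s$ from one scale to another with eqn 9.106 gives the same value as evaluating eqn 9.108 directly, as expected from the way $\Lambda_{QCD}$ is defined.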

The QCD-improved parton model can be used to calculate the lowest- 
order QCD corrections to the naive QPM, so, if sufficiently precise 
experiments can be performed, the theory can be tested. 

The value of $\alpha_s(Q^2)$ can be determined³³ from many different
processes, including the following:


(1) The ratio $R_3 = \sigma(e^+e^- \to 3\ \mathrm{jets})/\sigma(e^+e^- \to 2\ \mathrm{jets})$.
The numerator involves diagrams (see Fig. 9.27) with gluon emission from
the outgoing quark (antiquark) and the result is therefore proportional
to $\alpha_s$. Therefore, measurements of $R_3$ at different energies
can be used to determine $\alpha_s$ over a broad range of $Q^2$.


(2) $BR(\tau \to l\nu_l\nu_\tau)$, where $l$ can be either $e$ or $\mu$. To
lowest order in $\alpha_s$, we can use the universality of the
charged-current weak interactions. Apart from the two lepton flavours, the
$\tau$ can decay to $q\bar{q}'$. The mass of the $\tau$ is too low to allow
decays to $c\bar{s}$, and therefore the only hadronic decays (for $\tau^+$)
are to $u\bar{d}_C$, where $d_C$ is the Cabibbo-rotated $d$-quark state.
Allowing for 3 quark colours, we get $BR(\tau \to l\nu\nu) = 1/5$. There
are similar QCD corrections to the hadronic decay as in the case of
$e^+e^-$ annihilation, and therefore precise measurements of the $\tau$
branching ratio can be used to determine a value of $\alpha_s$ at a scale
$Q^2 = m_\tau^2$.


(3) $\Upsilon$ decays. The most common decay mode of the ground state is
$\Upsilon \to 3g$. QED corrections can give the process
$\Upsilon \to \gamma + 2g$, and therefore measurements of the ratio of
these two processes depend on the ratio $\alpha_s/\alpha$.


(4) Scaling violations in DIS (see Section 9.6.7).


A summary plot [91] of these different determinations of $\alpha_s$ over a
large range of $Q^2$ is shown in Fig. 9.28. The predicted decrease in
$\alpha_s$ with increasing $Q^2$ can clearly be seen. The good agreement
between the different measurements and the QCD predictions shows that the
theory has now been tested to a precision of better than 1%.


9.6.5 Experimental tests of the gauge structure 
of QCD 


We have seen that there are simple experimental results that demonstrate
that quarks come in 3 colours, but there are no equivalent simple
demonstrations that gluons come in 8 colours. The question we need
to ask is: what is the evidence to justify the choice of the SU(3) gauge
group? To some extent, the question has been addressed implicitly by
the measurements of $\alpha_s(Q^2)$ as a function of $Q^2$, which were
consistent with QCD (see Section 9.6.4). However, it is still interesting
to see if we can do a more direct test of the choice of the gauge group as
SU(3). This can be done in $e^+e^-$ annihilation to 3 or 4 jets. These
processes involve the vertices $qqg$, $ggg$, and $gggg$, and they are
therefore sensitive to the colour factors for $qqg$ and $ggg$, which in
SU(3) are related to the colour factors $C_F = 4/3$ and $C_A = 3$ (see
Section 9.6.3). The following are some of the variables used in this
analysis:


(1) Charged-particle multiplicity in gluon jets compared with quark
jets. As the amplitude for gluon splitting is greater than that for a
quark to emit a gluon, the multiplicity should be higher in gluon
jets than in quark jets. Quark jets from b or c quarks can be tagged
by the long lifetimes of the b and c quarks, and the gluon jet will
tend to be the lowest-energy jet in 3-jet events because of the
bremsstrahlung process.


Fig. 9.28 Measurement of $\alpha_s$ with different processes as a function
of the scale $Q^2$. From [91].

Fig. 9.29 Results of a global fit to $e^+e^-$ data [115] for the colour
factors $C_F$ and $C_A$. The contours from fits to individual analyses are
shown and the shaded area is the result for the global fit to all
variables. The star represents the QCD prediction.


(2) The angular distribution in 4-jet events. The angle between the
planes of two pairs of jets is sensitive to the $qqg$, $ggg$, and $gggg$
vertices and therefore to the colour factors.


(3) The event shape. This can be parameterized by the variable $T$,
called thrust and defined as

$$T = \max_{\mathbf{n}}\left(\frac{\sum_i |\mathbf{p}_i \cdot \mathbf{n}|}{\sum_i |\mathbf{p}_i|}\right) \qquad (9.109)$$

where the sum is over all the calorimeter cells or charged-particle
tracks, with momenta $\mathbf{p}_i$. The unit vector $\mathbf{n}$ is varied to
maximize the value of $T$. An ideal 2-jet event has $T = 1$ and multijet
events will have lower values. The thrust distribution therefore depends
on the gluon radiation and is thus sensitive to the colour factors
for the different vertices.
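
The thrust defined in eqn 9.109 can be evaluated exactly for a small number of particles, which may help to build intuition for the values quoted above. The sketch below is an illustration, not the procedure used by the LEP experiments. It relies on the fact that for a fixed axis the sum of $|\mathbf{p}_i \cdot \mathbf{n}|$ equals the projection of a signed sum of the momenta onto that axis, so the maximum over $\mathbf{n}$ is the largest $|\sum_i s_i \mathbf{p}_i|$ over all sign choices $s_i = \pm 1$. The cost grows as $2^N$, so this is only practical for a handful of jets. The function name `thrust` and the two test events are our own.

```python
import itertools
import math


def thrust(momenta):
    """Exact thrust for a short list of 3-momenta (px, py, pz)."""
    total = sum(math.sqrt(px * px + py * py + pz * pz)
                for px, py, pz in momenta)
    if total == 0.0:
        raise ValueError("need at least one non-zero momentum")
    best = 0.0
    # Fixing the first sign to +1 removes the trivial overall sign flip.
    for tail in itertools.product((1, -1), repeat=len(momenta) - 1):
        signs = (1,) + tail
        sx = sum(s * p[0] for s, p in zip(signs, momenta))
        sy = sum(s * p[1] for s, p in zip(signs, momenta))
        sz = sum(s * p[2] for s, p in zip(signs, momenta))
        best = max(best, math.sqrt(sx * sx + sy * sy + sz * sz))
    return best / total


# Ideal back-to-back 2-jet event and a symmetric planar 3-jet event.
two_jet = [(0.0, 0.0, 10.0), (0.0, 0.0, -10.0)]
three_jet = [(1.0, 0.0, 0.0),
             (-0.5, math.sqrt(3) / 2, 0.0),
             (-0.5, -math.sqrt(3) / 2, 0.0)]
print(thrust(two_jet), thrust(three_jet))
```

The back-to-back event gives $T = 1$, and the symmetric three-jet configuration gives $T = 2/3$, illustrating how gluon radiation lowers the thrust.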


The measurements were performed by the LEP experiments, with data
being taken at (or close to) the Z peak, which provided the highest
statistics. The clean and well-defined environment in $e^+e^-$ annihilation
and the high energy made it possible to identify clean 3- and 4-jet events
and hence to have small systematic errors. The results of a global fit
to these data [91] are shown in Fig. 9.29. The data are consistent with
the choice of SU(3) for the gauge symmetry and clearly exclude other
choices.
9.6.6 Experimental fits to the quark distribution
functions


From all the DIS data with electron, muon, and neutrino beams, we 
can perform a fit to determine the quark and antiquark distribution 
functions. Note that although the most precise data come mainly from 
electron and muon scattering, the neutrino data are essential to separate 
antiquarks from quarks. The results of one of these global fits are shown 
in Fig. 9.30. 

We can see that the valence quarks dominate at large x but that the 
sea quarks are important at very low x. The gluon distribution is also 
very important at low x, as discussed in the next section. 


9.6.7 The gluon distribution function 


We have already seen (see Section 9.4.4) that the gluons carry about 50%
of the momentum of a nucleon, but the question is how to determine the
shape of the gluon distribution function $g(x)$. This cannot be determined
as directly as for the quarks, because the gluons carry no electric or weak
charge and therefore do not interact directly with photons or W bosons.
We can, however, determine the shape of $g(x)$ from the scaling violations,
the slow variation of the quark distribution functions with $Q^2$ (see
Fig. 9.16). These scaling violations can be explained in QCD by
higher-order corrections to the simple QPM. A quark carries strong charge,
so it can emit a virtual gluon, and a gluon also carries strong charge (note
the important difference with electromagnetism, where the carrier of the
force, the photon, is electrically neutral) and can therefore turn into a
$q\bar{q}$ pair. Hence we have corrections to the QPM, as illustrated in the
Feynman diagrams in Fig. 9.31.


Fig. 9.30 QCD global fits to distribution functions [115] for two different
scales: $Q^2 = 10\ \mathrm{GeV}^2$ (a) and $Q^2 = 10^4\ \mathrm{GeV}^2$ (b).

Fig. 9.31 QCD corrections to the QPM.

Fig. 9.32 QPM for hadron-hadron collisions. The generic labels a, b, c,
and d refer to the type of particles participating in the reaction.

From the process in Fig. 9.31(a), we expect the quarks to move down in
momentum, and hence the quark distribution functions will be enhanced
at low $x$ and depleted at high $x$. From the process in Fig. 9.31(b), we
expect an enhancement of quarks and antiquarks at low $x$. How much
of this sea of virtual quarks and antiquarks we resolve depends on the
wavelength of the probe we use. At longer wavelengths (lower momentum
transfer and hence smaller values of $Q^2$), we do not resolve the
quark-antiquark pairs and so they do not give a net contribution.
Conversely, at higher values of $Q^2$, we have sufficient resolution to
resolve them. Hence we expect that as $Q^2$ increases, $F_2(x, Q^2)$ should
increase at low $x$ and decrease at high $x$. Qualitatively, this is just
the behaviour seen in the data (see Fig. 9.16). From a quantitative QCD
analysis of these data, we can in fact determine $g(x)$, and the result of
such a determination [115] is shown in Fig. 9.30. These scaling violations
are proportional to $\alpha_s$ in lowest-order perturbation theory, and
therefore the QCD fits can also be used to provide another determination
of $\alpha_s$.


9.7 Hadron-hadron collisions


The naive QPM and the QCD-improved QPM can be easily extended
from lepton-hadron collisions to hadron-hadron collisions. Note that
these calculations only apply to ‘hard’ processes involving large
transverse momentum so that $\alpha_s$ is small enough for perturbation
theory to be valid. At low values of $Q^2$, the process is too complicated
for any QCD predictions to be made and this is a regime in which we are
obliged to use simple phenomenological models. For hard processes, the
QPM picture for the reaction is shown in Fig. 9.32.

At the parton level, the collision is between a parton with momentum
fraction $x_a$ in one hadron and another parton with momentum fraction
$x_b$ in the other hadron. The probability of finding a parton at a given
momentum fraction $x$ is given by the quark and gluon distribution
functions (see Section 9.6.6). The cross section at the parton level can be
calculated from perturbation theory (using QED, electroweak theory, or QCD
as appropriate). This picture can then be converted into a quantitative
prediction in the form of a convolution integral:

$$\sigma(pp \to cd) = \sum_{a,b}\int_0^1 dx_a \int_0^1 dx_b\, f_a(x_a, Q^2)\, f_b(x_b, Q^2)\, \hat{\sigma}(ab \to cd) \qquad (9.110)$$


where the sum runs over all the parton types, the $f(x, Q^2)$ are the parton
distribution functions, $\hat{\sigma}$ refers to the parton-parton collision,
which can be calculated, and the integrals are over the parton momentum
fractions in the two protons. The formula has been written for the case of
$pp$ collisions, but it can obviously be generalized to other types of
hadron-hadron collisions. The momenta of the partons in terms of the CMS
energy of the $pp$ collision, $\sqrt{s}$, are given by $p_a = x_a\sqrt{s}/2$
and $p_b = x_b\sqrt{s}/2$. The square of the CMS energy in the parton-parton
collision is therefore given by

$$\hat{s} = E^2_{\mathrm{total}} - p^2_{\mathrm{total}} = (x_a + x_b)^2 s/4 - (x_a - x_b)^2 s/4 = x_a x_b s \qquad (9.111)$$
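
A quick numerical check of eqn 9.111 can be made by building the total energy and momentum of the two partons explicitly. The sketch below assumes massless partons moving collinearly with their parent hadrons, as in the derivation above. The function name and the example values ($\sqrt{s} = 13$ TeV, $x_a = 0.1$, $x_b = 0.001$) are illustrative assumptions.

```python
import math


def parton_cms_energy_squared(x_a, x_b, s):
    """Return s-hat from the parton energies and momenta (massless partons)."""
    half_root_s = math.sqrt(s) / 2
    p_a = x_a * half_root_s     # parton a moves along +z
    p_b = x_b * half_root_s     # parton b moves along -z
    e_total = p_a + p_b         # E = |p| for massless partons
    p_total = p_a - p_b         # net longitudinal momentum
    return e_total ** 2 - p_total ** 2


s = 13000.0 ** 2                # (GeV)^2, an illustrative collider energy
s_hat = parton_cms_energy_squared(0.1, 0.001, s)
print(math.sqrt(s_hat))         # parton-parton CMS energy in GeV
```

For these inputs the parton-parton CMS energy is 130 GeV, only 1% of the hadron-hadron CMS energy, which shows why hard processes at a given mass scale probe small momentum fractions at high-energy colliders.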


9.7.1 Drell-Yan 


An application of this formalism is to the Drell-Yan process (lepton-pair
production in hadron-hadron collisions). The parton-level process is
$q\bar{q} \to l^+l^-$ and the cross section is given by analogy with
$e^+e^- \to l^+l^-$ as

$$\hat{\sigma} = \frac{q_i^2}{3}\,\frac{4\pi\alpha^2}{3\hat{s}} \qquad (9.112)$$

where $q_i$ is the charge of the quark in units of $e$ and the extra factor of
1/3 compared with the equivalent equation for $e^+e^-$ annihilation comes
from the fact that only $q\bar{q}$ pairs of the same colour can annihilate to
give a virtual photon. By differentiating eqn 9.110 and substituting for
$\hat{\sigma}$ from eqn 9.112, we obtain

$$\frac{d^2\sigma(hh \to l^+l^-)}{dx_a\, dx_b} = \sum_i \frac{q_i^2}{3}\,\frac{4\pi\alpha^2}{3\hat{s}}\left[f_i(x_a)\bar{f}_i(x_b) + f_i(x_b)\bar{f}_i(x_a)\right] \qquad (9.113)$$

It is convenient to change variables to

$$y = \frac{1}{2}\ln\frac{E + p_z}{E - p_z}, \qquad \tau = \frac{\hat{s}}{s}$$

where $y$ is the rapidity. From a Jacobian transformation, we obtain

$$\frac{d^2\sigma(hh \to l^+l^-)}{dy\, d\tau} = \sum_i \frac{q_i^2}{3}\,\frac{4\pi\alpha^2}{3 s \tau}\left[f_i(x_a)\bar{f}_i(x_b) + f_i(x_b)\bar{f}_i(x_a)\right] \qquad (9.114)$$


Therefore, if we assume approximate scaling for the quark distribution
functions, we should get the same scaled cross section for different values
of $s$. This prediction is compared with experimental data [56] for
$pp \to \mu^+\mu^- X$ in Fig. 9.33. There are many successful predictions
for other Drell-Yan processes in $p\bar{p}$ interactions at the CERN
$Sp\bar{p}S$ collider and the Tevatron and in $pp$ interactions at the LHC.
The QCD-improved parton model has been used successfully for many hard
processes at the LHC. Some of these applications will be considered in
Chapter 13.
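
The change of variables used for the Drell-Yan cross section can be checked numerically. For massless collinear partons, $(E + p_z)/(E - p_z) = x_a/x_b$, so $y = \frac{1}{2}\ln(x_a/x_b)$ and $\tau = x_a x_b$, which inverts to $x_a = \sqrt{\tau}\,e^{y}$ and $x_b = \sqrt{\tau}\,e^{-y}$. The sketch below, with function names of our own choosing, verifies the round trip and estimates the Jacobian by finite differences.

```python
import math


def to_tau_y(x_a, x_b):
    """Map momentum fractions to (tau, y) for massless collinear partons."""
    return x_a * x_b, 0.5 * math.log(x_a / x_b)


def to_x(tau, y):
    """Inverse map: (tau, y) back to (x_a, x_b)."""
    root_tau = math.sqrt(tau)
    return root_tau * math.exp(y), root_tau * math.exp(-y)


def jacobian(tau, y, h=1e-6):
    """Central-difference estimate of d(x_a, x_b)/d(tau, y)."""
    xa_tp, xb_tp = to_x(tau + h, y)
    xa_tm, xb_tm = to_x(tau - h, y)
    xa_yp, xb_yp = to_x(tau, y + h)
    xa_ym, xb_ym = to_x(tau, y - h)
    dxa_dtau = (xa_tp - xa_tm) / (2 * h)
    dxb_dtau = (xb_tp - xb_tm) / (2 * h)
    dxa_dy = (xa_yp - xa_ym) / (2 * h)
    dxb_dy = (xb_yp - xb_ym) / (2 * h)
    return dxa_dtau * dxb_dy - dxa_dy * dxb_dtau


print(jacobian(0.01, 0.5))   # close to -1, so |J| = 1
```

Since the magnitude of the Jacobian is 1, the cross section per unit $dy\,d\tau$ has the same form as that per unit $dx_a\,dx_b$ with $\hat{s}$ written as $s\tau$.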
Chapter summary


• The structure functions show approximate Bjorken scaling behaviour;
i.e. $F_2$ is a function only of $x$ and is independent of $Q^2$. This tells
us that the scattering occurs off point-like parton constituents (the
quarks).

• The Callan-Gross relation (eqn 9.77) tells us that the quarks have
spin $\frac{1}{2}$.

• The form of the $y$ distribution in neutrino and antineutrino scattering
is consistent with spin-$\frac{1}{2}$ quarks and antiquarks interacting via a
parity-violating $V-A$ interaction.

• A comparison of neutrino and electron/muon scattering confirms the
quark charge assignments.

• Only about 50% of the momentum of the nucleons is carried by quarks,
the rest being carried by gluons.

• From $e^+e^-$ annihilation and hadron-hadron interactions, we have also
seen that there are 3 colours of quarks and they interact with spin-1
gluons.

• According to QCD theory, the strong coupling constant $\alpha_s(Q^2)$
decreases with increasing $Q^2$. This allows the successful use of
perturbative QCD to describe a large number of different experiments, as
well as a precise determination of $\alpha_s(M_Z)$.

• All the data are consistent with point-like quarks, but it is always
possible that future experiments will reveal that quarks are composite ...

• The parton model can be generalized from lepton-hadron interactions
to describe hadron-hadron interactions in ‘hard’ processes, i.e. those
in which there is a large momentum transfer (high $Q^2$). The QCD-improved
version of this model will be used to make predictions for hard
interactions at LHC energies (see Chapter 13).
Further reading


• Cooper-Sarkar, A. and Devenish, R. (2004). Deep Inelastic Scattering.
Oxford University Press. This provides a much more detailed and advanced
description than that given here.

• Burcham, W. E. and Jobes, M. (1994). Nuclear and Particle Physics.
Pearson. This includes a more in-depth explanation of renormalization
theory.

• Griffiths, D. (2008). Introduction to Elementary Particles (2nd revised
edn). Wiley-VCH. This gives a very clear introduction to the evaluation of
Feynman diagrams.

• Halzen, F. and Martin, A. D. (1984). Quarks and Leptons: An Introductory
Course in Modern Particle Physics. Wiley. This is another very good
graduate-level textbook explaining the theory of the Standard Model. It
contains a clear explanation of how to evaluate Feynman diagrams.

• Söding, P. (2010). On the discovery of the gluon. EPJ H, 35, 3. This is
a good summary of this important discovery.


Exercises 


(9.1) Verify the final steps in the derivation of the 
Rutherford scattering cross section (eqn 9.12). 


(9.2) Consider a model of a quark that has a uniform
distribution of electric charge within a radius $r_0$.
Show that for $qr_0 \ll 1$, the form factor (see
eqn 9.16) is given approximately by $F(Q^2) = 1 - aQ^2r_0^2$
and determine the value of the constant $a$.
The HERA DIS data agree with the Standard
Model for $Q^2$ up to values of ~$10^4\ \mathrm{GeV}^2$. Use this
observation to estimate an upper limit on $r_0$.


(9.3) Perform the integral in eqn 9.99 with the approximation
that $-q^2 \gg m^2$ and verify that you obtain
the result given in eqn 9.100.


(9.4) Use eqn 9.104 to derive eqn 9.105.
Hint: Use $\ln(y/x) = \ln(y/z) + \ln(z/x)$.


(9.5) Using the data on the $y$ distribution for neutrinos
and antineutrinos (see Fig. 9.8), estimate the
relative contribution of antiquarks to quarks in the
proton at the typical $Q^2$ scale probed by these
reactions. Explain why you would expect this ratio
to change with $Q^2$.


(9.6) Use the data shown in Fig. 9.17 to estimate the
fraction of the proton momentum carried by gluons.
How and why would you expect this fraction
to evolve with $Q^2$?


(9.7) Draw Feynman diagrams to show how charm
quarks may be produced in $\nu N$ and $\bar{\nu}N$ interactions.
Explain, with appropriate diagrams and
paying particular attention to the signs of the electric
charges of the muons, how certain dimuon
events may indicate charmed-meson production.
Which of $\nu N$ or $\bar{\nu}N$ is likely to give the larger
signal and why?


(9.8) Consider the Drell-Yan reaction of $\pi^\pm$ on a carbon
target. The carbon nucleus has an equal number
of protons and neutrons and thus also an equal
number of $u$ and $d$ quarks. Show that in the
quark-parton model, the ratio

$$r = \frac{\sigma(\pi^+ C \to \mu^+\mu^- X)}{\sigma(\pi^- C \to \mu^+\mu^- X)}$$

equals $1/4$ when $\hat{s}/s$ approaches 1. What value does
$r$ have to be for small $\hat{s}/s$?

What are the experimental issues associated
with studying Drell-Yan reactions?


(9.9) Draw Feynman diagrams for the processes $e^+p \to \bar{\nu}_e X$
and $e^-p \to \nu_e X$. Hence explain which of these
two processes would be expected to have larger
cross sections at large $Q^2$ and $x$.


(9.10) Draw a Feynman diagram for charm production
in $ep$ DIS. Hence explain how measurements of
this process could be used to constrain the gluon
parton density function.


(9.11) The QPM prediction for the ratio $R$ (eqn 9.86) is
modified by QCD. According to first-order QCD
perturbation theory, the correction is a multiplicative
factor $1 + \alpha_s(Q^2)/\pi$. Use this result and
the data for $R$ (see Fig. 9.23) to estimate the
value of the strong coupling constant $\alpha_s(Q^2)$ for
$Q^2 \sim 5\ \mathrm{GeV}^2$.


(9.12) Consider the Drell-Yan process for protons on a
proton ($p$) and deuterium ($d$) target, $pp \to \mu^+\mu^- X$ and
$pd \to \mu^+\mu^- X$. Explaining any assumptions you
make, show that in the QPM, the ratio at given
values of $x_1$ and $x_2$ is

$$R \approx \frac{[4u(x_1) + d(x_1)][\bar{u}(x_2) + \bar{d}(x_2)]}{4u(x_1)\bar{u}(x_2) + d(x_1)\bar{d}(x_2)}$$

Oscillations and CP
violation in meson systems


This chapter and the next deal with the general subject of ‘oscillations’ 
and CP violation, with this chapter focusing on oscillations and CP 
violation in meson systems and Chapter 11 dealing with neutrino oscil- 
lations. While there are some important differences between oscillations 
within meson systems and neutrino oscillations, they both demonstrate 
quantum-mechanical interference over macroscopic distances and allow 
extremely small mass differences to be measured. Moreover, studies of 
both the neutral kaon system and neutrino oscillations have produced 
results challenging the theoretical orthodoxy of their times. 

The neutral kaon system has been extensively studied over more than
sixty years, mainly in fixed-target experiments. CP violation was
discovered in this system in 1964. More recently, B-meson
systems have been studied at ‘B-factories’, which have produced
spectacular results demonstrating CP-violating effects. There are now four
neutral meson systems in which these oscillation and CP-violating
processes have been studied: kaons and $D$, $B$, and $B_s$ mesons.
CP-violating effects have also been observed in decays of charged
particles.

CP violation is a necessary condition to understand the observed 
baryon asymmetry of the universe, but the amount of CP violation 
observed in the quark sector is too small to explain this effect. However, 
it is now believed that CP violation in the neutrino sector provides 
the most plausible explanation, so this subject will be discussed in the 
context of neutrino oscillations (Chapter 11). 

All the CP violation effects observed in the quark sector to date are 
compatible with the Standard Model. The real interest in CP violation 
physics in the quark sector is that it usually arises from Feynman dia- 
grams with loops, which naturally makes it very sensitive to new heavy 
particles coupling within the loop. This means that very precise measure- 
ments of CP violation can give access to information about new physics 
at very high mass scales. This is another illustration of how the indirect 
search for new physics via precision measurements is complementary to 
direct searches at machines like the LHC. 


Particle Physics in the LHC Era, Giles Barr, Robin Devenish, Roman Walczak, 
& Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak, 
& Tony Weidberg 2016. Published in 2016 by Oxford University Press. 
1 Some of the first evidence for parity violation in weak interactions was
provided by the so-called ‘$\tau$-$\theta$ puzzle’, where what is now known as
the $K^+$ was observed to decay into two or three pions, which cannot happen
if parity is conserved.


           $\Delta M$ (ps$^{-1}$)    Lifetimes (ps)
$K^0$      0.005                     89.5      51000
$D^0$      0.01                      0.41      0.41
$B^0$      0.504                     1.52      1.52
$B_s^0$    17.7                      1.39      1.62


Table 10.1 Neutral meson properties. $\Delta M$ is the mass difference
between the two mass eigenstates and the lifetimes are given for the two
mass eigenstates.


2 The $Q$ value is the mass of the parent minus the mass of the decay
products and gives the total amount of kinetic energy available in the
centre-of-mass frame.


10.1 Symmetries 


Since the role of symmetries is essential to understanding the behav- 
iour of neutral meson systems, the reader should be familiar with these 
concepts from Chapter 2. The symmetries we discuss here are parity 
P, charge conjugation C, time reversal T, and their combinations CP 
and CPT. Parity and charge-conjugation symmetries are conserved for 
strong and electromagnetic interactions but not for weak interactions. 
Massless neutrinos and antineutrinos have a definite helicity and only
left-handed neutrinos or right-handed antineutrinos participate in the
weak interactions: parity is ‘maximally’ violated¹ (see Section 7.2.5).

As will be discussed from Section 10.5 onwards, there is evidence for 
violation of CP symmetry in the neutral kaon system and in systems 
containing heavier quarks. Since relativistic field theories predict that 
the combined symmetry of CPT is conserved, T-violating effects are 
also expected in these systems. 


10.2 Neutral kaon decays and $K_1$ and $K_2$


The four neutral meson systems that can be studied extensively are listed
in Table 10.1. Each of these is the lightest neutral meson containing a
particular combination of flavours of quarks, and so the only available
decay modes are via the weak interaction. The main thing that makes
these particles so fascinating is that there are some decay modes where
the same final state is accessible by both the particle and antiparticle,
for example $K^0 \to \pi^+\pi^-$ and $\bar{K}^0 \to \pi^+\pi^-$. The properties
shown in Table 10.1 will be discussed extensively in this chapter. They
impart striking differences in the ways in which the four meson systems
behave. We start by looking closely at the neutral kaon system, which has
two strikingly different lifetimes associated with it, and will then return
to this table to discuss the properties of the other meson systems. Because
the kaon is light, it has a limited number of decay channels and so is the
simplest to consider first.

For the initial discussion, we assume that CP is strictly conserved.
As just mentioned, the $K^0$ (quark content $\bar{s}d$) and $\bar{K}^0$
($\bar{d}s$) must decay weakly. They decay to two- and three-pion final
states, and semileptonically: $K^0 \to l^+ + \nu_l + \pi^-$ and
$\bar{K}^0 \to l^- + \bar{\nu}_l + \pi^+$, where $l$ is $e$ or $\mu$.
The charge of the lepton or pion in these semileptonic decays can be
used to determine the strangeness of the decaying kaon. To understand
the phenomenology of kaon decays, it is first necessary to examine the
properties of the final states under the operation of CP.

The semileptonic final states are CP-conjugate states; in other words,
$CP(l^+ + \nu_l + \pi^-) \to l^- + \bar{\nu}_l + \pi^+$ and vice versa. The
two- and three-pion final states are CP eigenstates. The two-pion final
states are $\pi^0\pi^0$ and $\pi^+\pi^-$, with a $Q$ value² of ~220 MeV. In
the $\pi^0\pi^0$ final state, because they are identical bosons, the
$\pi^0$s will be in a state of even relative angular momentum,
$L = 0, 2, \ldots$, so $P = [\eta_P(\pi)]^2(-1)^L = +1$
and $C = [\eta_C(\pi^0)]^2 = +1$; hence $CP = +1$. For the $\pi^+\pi^-$ final
state, $P = [\eta_P(\pi)]^2(-1)^L = +1$; the operation of $C$ interchanges
$\pi^+$ and $\pi^-$ and $C(\pi^+, \pi^-) = (\pi^-, \pi^+)$, so
$\eta_C = (-1)^L = +1$ and hence again $CP = +1$.

The Q value of the three-pion final states is only ~70 MeV, so this
suggests that the most probable value for the relative angular momentum
is $L = 0$.³ For the $\pi^0\pi^0\pi^0$ final state, $P = [\eta_P(\pi)]^3 = -1$
and $C = [\eta_C(\pi^0)]^3 = (+1)^3 = +1$; hence $CP = -1$. The $\pi^+\pi^-\pi^0$ final
state can be considered as a $\pi^0$ combined with the $\pi^+\pi^-$ state. With
$CP(\pi^+, \pi^-) = +1$ and $CP(\pi^0) = P(\pi^0)C(\pi^0) = -1$, $CP = -1$ as for
the $3\pi^0$ mode.

To summarize, the $2\pi$ final states are CP-even and the $3\pi$ final states
are CP-odd. The important thing to notice is that the Q values of the
two- and three-pion decay modes are very different. In particular, the
Q value of the three-pion decay mode is only 70 MeV, which leaves very
little phase space for these decays, and hence the partial decay rates of
$K^0$ to the two- and three-pion final states will be very different.

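$K^0$ and $\bar{K}^0$ are not CP eigenstates, although

$$CP|K^0\rangle \to |\bar{K}^0\rangle, \qquad CP|\bar{K}^0\rangle \to |K^0\rangle$$

Since the $K^0$ and $\bar{K}^0$ both decay to the same two- and three-pion final
states, they are coupled by virtual $|\Delta S| = 2$ second-order weak
transitions such as

$$K^0 \leftrightarrow (2\pi) \leftrightarrow \bar{K}^0, \qquad K^0 \leftrightarrow (3\pi) \leftrightarrow \bar{K}^0$$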

At the quark level, the diagrams that change between $K^0$ and $\bar{K}^0$ states
are shown in Fig. 10.1; they are often referred to as box diagrams.

This second-order weak coupling ‘mixes’ the $K^0$ and $\bar{K}^0$, meaning that
the physical kaon states (the states with definite masses and lifetimes)
must evolve as linear superpositions of $K^0$ and $\bar{K}^0$, i.e.

$$|\psi(t)\rangle = a(t)|K^0\rangle + b(t)|\bar{K}^0\rangle$$

The physical states of the neutral kaons that are CP eigenstates can
easily be seen to be

$$|K_1\rangle = \sqrt{\tfrac{1}{2}}\left(|K^0\rangle + |\bar{K}^0\rangle\right), \qquad
|K_2\rangle = \sqrt{\tfrac{1}{2}}\left(|K^0\rangle - |\bar{K}^0\rangle\right) \qquad (10.1)$$

$K_1$ and $K_2$ are orthogonal, and expressing these states as linear
combinations of $K^0$ and $\bar{K}^0$ corresponds to a change of basis from the
strong-interaction eigenstates to the weak eigenstates of CP.


³As the momenta of the pions are so low, they would need a very large
impact parameter to have $L = 1$.


Fig. 10.1 Box diagrams for $K^0 \leftrightarrow \bar{K}^0$ transitions, with $u$, $c$, $t$ quarks and $W$ bosons in the loop.
⁴The heavier $D$, $B$, and $B_s$ mesons also mix in a similar way; however,
because they are heavier, there are many decay modes with large Q values
accessible to each of the CP eigenstates and so the differences in lifetime
in the heavy-meson systems are smaller than for kaons.


The $K_1$ and $K_2$ states contain equal amounts of $K^0$ and $\bar{K}^0$ and are
eigenstates of even ($+1$) and odd ($-1$) CP, respectively, since

$$CP|K_1\rangle \to \sqrt{\tfrac{1}{2}}\left(|\bar{K}^0\rangle + |K^0\rangle\right) = +|K_1\rangle \qquad (10.2)$$
$$CP|K_2\rangle \to \sqrt{\tfrac{1}{2}}\left(|\bar{K}^0\rangle - |K^0\rangle\right) = -|K_2\rangle \qquad (10.3)$$

Since $K_1$ and $K_2$ are states of different CP and CP is conserved in their
decays, they will decay to different final states with different lifetimes:

$$K_1 \to 2\pi \quad (CP = +1) \qquad (10.4)$$
$$K_2 \to 3\pi \quad (CP = -1) \qquad (10.5)$$

The small phase space available for the $K_2 \to 3\pi$ decay ($Q \sim 70$ MeV)
means that $\tau_2$, the lifetime of the $K_2$, will be much longer than $\tau_1$, the
lifetime of the $K_1$.⁴ The $K_1$ and $K_2$ are also expected to have different
masses. The $K_1 - K_2$ mass difference is very important and is discussed
further below; for the present, the discussion will concentrate on the
consequences of their different lifetimes.

The strangeness eigenstates $K^0$ and $\bar{K}^0$ can be expressed in terms of
$K_1$ and $K_2$ by inverting eqns 10.1:

$$|K^0\rangle = \sqrt{\tfrac{1}{2}}\left(|K_1\rangle + |K_2\rangle\right) \qquad (10.6)$$
$$|\bar{K}^0\rangle = \sqrt{\tfrac{1}{2}}\left(|K_1\rangle - |K_2\rangle\right) \qquad (10.7)$$

An initially pure $K^0$ (or $\bar{K}^0$) state will be an equal mixture of $K_1$ and $K_2$.
If the number of $K^0$ produced at a (proper) time $t = 0$ is $N_0$, then the
total number of kaons at a subsequent time $t$ will be

$$N(t) = \tfrac{1}{2} N_0 \left(e^{-t/\tau_1} + e^{-t/\tau_2}\right) \qquad (10.8)$$

Since $\tau_2 \gg \tau_1$, the $K_1$ component will die away first, and at a time
much greater than the lifetime of the $K_1$ only the $K_2$ component
will remain. If a pure $K^0$ beam is produced, it will be found to contain
a rapidly decaying component, $K_1$, decaying to two pions, and a slow
component, $K_2$, decaying to three pions. Although the kaons produced
initially are all of the same strangeness, the $K_2$ component that remains
at long times will be an equal mixture of $K^0$ and $\bar{K}^0$. The measured
lifetimes of $K_1$ and $K_2$ are 89 ps and 51 ns, respectively, corresponding
to decay lengths of $c\tau_1 = 2.7$ cm and $c\tau_2 = 15.3$ m. The $K_2$ lifetime is about 600
times the lifetime of the $K_1$, and at modest energies (i.e. a few GeV) the
$K_1$ component will decay in a few centimetres but the $K_2$ component
will travel many metres before decaying; it is therefore possible to make
essentially pure $K_2$ beams.
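
The decay lengths and the two-component decay law of eqn 10.8 can be checked numerically. The following is a minimal sketch only: the function names are illustrative, and the lifetimes are the rounded values quoted in the text rather than precise world averages.

```python
import math

C_M_PER_S = 2.998e8   # speed of light in m/s
TAU_1 = 89e-12        # K1 lifetime in s (rounded value quoted in the text)
TAU_2 = 51e-9         # K2 lifetime in s (rounded value quoted in the text)


def decay_length(tau):
    """Return c*tau in metres for a proper lifetime tau in seconds."""
    return C_M_PER_S * tau


def kaons_remaining(t, n0=1.0):
    """Eqn 10.8: kaons left at proper time t (s) from n0 initial K0."""
    return 0.5 * n0 * (math.exp(-t / TAU_1) + math.exp(-t / TAU_2))


if __name__ == "__main__":
    print(f"c*tau_1 = {100 * decay_length(TAU_1):.1f} cm")
    print(f"c*tau_2 = {decay_length(TAU_2):.1f} m")
    print(f"tau_2 / tau_1 = {TAU_2 / TAU_1:.0f}")
    # After ten K1 lifetimes essentially only the K2 half of the beam remains.
    print(f"N(10 tau_1) / N0 = {kaons_remaining(10 * TAU_1):.3f}")
```

These are proper-time quantities; in the laboratory the decay lengths are stretched by the Lorentz factor $\beta\gamma$ of the beam, which is why the $K_2$ component survives for many metres.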
10.3 Mass differences of neutral mesons


The discussion in Section 10.2 introduced $K_1$ and $K_2$ as the CP
eigenstates of the neutral kaon system that arise as a consequence of
second-order weak coupling between $K^0$ and $\bar{K}^0$. It was indicated that
$K_1$ and $K_2$ must have different masses and it is instructive to look more
closely at how this arises. We will label the states as $|K\rangle$ etc., but the
formalism is valid for any of the neutral meson systems in Table 10.1.

We introduce the formalism in two parts, first by assuming the particles
do not decay, then introducing a treatment that allows the mesons
to decay. We continue to assume that CP is conserved. Take a meson
in a state that is composed of a linear combination of the states $|K^0\rangle$
and $|\bar{K}^0\rangle$ that have definite strangeness (i.e. are eigenfunctions of the
strangeness operator). Since $|K^0\rangle$ and $|\bar{K}^0\rangle$ are coupled by second-order
weak interactions, we can describe the kaon system by a pair of coupled
equations using Schrödinger’s time-dependent equation:

$$i\frac{\partial}{\partial t}\begin{pmatrix}|K^0\rangle \\ |\bar{K}^0\rangle\end{pmatrix}
= H \begin{pmatrix}|K^0\rangle \\ |\bar{K}^0\rangle\end{pmatrix}
= \begin{pmatrix} M & M_{12} \\ M_{12}^* & M \end{pmatrix}
\begin{pmatrix}|K^0\rangle \\ |\bar{K}^0\rangle\end{pmatrix} \qquad (10.9)$$

where $M$ is the mass of $K^0$ and $\bar{K}^0$ (their masses are identical if we
impose CPT invariance). The off-diagonal term $M_{12} = \langle K^0|H_{\text{weak}}|\bar{K}^0\rangle$
represents the second-order weak coupling between $|K^0\rangle$ and $|\bar{K}^0\rangle$. The
matrix $H$ represents the Hamiltonian of the system and so is Hermitian.
If we instead consider the state as a combination of $|K_1\rangle$ and $|K_2\rangle$ (which
are eigenfunctions of the CP operator), the Schrödinger equation can
be written in terms of a pair of equations that are no longer coupled:

$$i\frac{\partial}{\partial t}\begin{pmatrix}|K_1\rangle \\ |K_2\rangle\end{pmatrix}
= H' \begin{pmatrix}|K_1\rangle \\ |K_2\rangle\end{pmatrix}
= \begin{pmatrix} M_1 & 0 \\ 0 & M_2 \end{pmatrix}
\begin{pmatrix}|K_1\rangle \\ |K_2\rangle\end{pmatrix} \qquad (10.10)$$

The elements of the matrix $H'$ give the masses of the $K_1$ and $K_2$ and
can be found from the eigenvalues of $H$, which are $M \pm |M_{12}|$. This is
how the mass splitting of the $K_1$ and $K_2$ arises. The eigenvectors of $H$
give the states $K_1$ and $K_2$ in terms of $K^0$ and $\bar{K}^0$ as in eqns 10.1.

We now turn to the case where the particles are able to decay. The
time evolution of a neutral meson wavefunction may be written as

$$|K^0(t)\rangle = e^{-iMt}\, e^{-t/2\tau}\, |K^0\rangle \qquad (10.11)$$

where the first exponential is the usual plane wave for a state with energy
$E = M$. The second exponential is imposed⁵ to give the exponential
decay for a state with proper lifetime $\tau$ (i.e. width $\Gamma = 1/\tau$ in natural
units), so that

$$|\langle K^0|K^0(t)\rangle|^2 \propto e^{-t/\tau} \quad (= e^{-\Gamma t})$$


⁵Although this is standard practice for considering meson decays, it is a
somewhat non-standard use of quantum mechanics. This second exponential
means that the state as written does not remain normalized; there is a
further piece to the wavefunction, not written here, that represents the part
of the state that has decayed.
⁶Time in the kaon centre-of-mass system.


We can generalize eqn 10.11 to a two-state system:

$$\begin{pmatrix}|K^0(t)\rangle \\ |\bar{K}^0(t)\rangle\end{pmatrix}
= X \begin{pmatrix}|K^0\rangle \\ |\bar{K}^0\rangle\end{pmatrix},
\quad \text{where } X = e^{-i\mathbf{M}t - \mathbf{\Gamma}t/2} \qquad (10.12)$$

$\mathbf{M}$ and $\mathbf{\Gamma}$ are 2 × 2 matrices encoding the time evolution of the two-state
system. $|K^0\rangle$ and its antiparticle $|\bar{K}^0\rangle$ are the flavour eigenstates. The
off-diagonal terms describe the transitions (mixing) between the meson
and antimeson. We can apply Schrödinger’s equation, $i\,d\psi/dt = H\psi$, to
identify the Hamiltonian, as we did previously:

$$H = \mathbf{M} - \frac{i}{2}\mathbf{\Gamma}$$

Because of the way we have introduced the decaying states, $H$ is not
Hermitian; however, since any matrix $A$ can be written in the form
$A = H_1 + iH_2$, where $H_1$ and $H_2$ are Hermitian, it follows that $\mathbf{M}$ and
$\mathbf{\Gamma}$ are Hermitian matrices. Also, we impose CPT invariance (a particle
and its antiparticle have identical mass and lifetime) to give

$$M_{21} = M_{12}^*, \qquad \Gamma_{21} = \Gamma_{12}^*$$
$$M_{11} = M_{22} = M, \qquad \Gamma_{11} = \Gamma_{22} = \Gamma$$

So the Hamiltonian simplifies to

$$H = \begin{pmatrix} M & M_{12} \\ M_{12}^* & M \end{pmatrix}
- \frac{i}{2}\begin{pmatrix} \Gamma & \Gamma_{12} \\ \Gamma_{12}^* & \Gamma \end{pmatrix}$$

The five fundamental quantities describing the mixing system are $M$, $\Gamma$,
$|M_{12}|$, $|\Gamma_{12}|$, and their relative phase $\arg(M_{12}/\Gamma_{12})$. $M_{12}$ and $\Gamma_{12}$ cannot
be fully determined, since there is an arbitrary, unobservable phase in
the wavefunction of $|K^0\rangle$. $H$ can be diagonalized as above to give the
masses and lifetimes of the $K_1$ and $K_2$ in terms of these quantities. The
mass splitting is given by a more complicated expression than the one we derived
above for the non-decaying mesons, but the masses are still approximately $M \pm |M_{12}|$.
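
To make the diagonalization concrete, the sketch below computes the two eigenvalues of the CPT-symmetric 2 × 2 Hamiltonian $H = \mathbf{M} - \frac{i}{2}\mathbf{\Gamma}$ in closed form and reads off a mass (real part) and a width ($-2$ times the imaginary part) for each physical state. The function name and the numerical inputs are illustrative assumptions in arbitrary units, not measured kaon parameters.

```python
import cmath


def physical_states(m, gamma, m12, gamma12):
    """Eigenvalues of H = M - (i/2) Gamma for equal diagonal elements.

    m, gamma       : real diagonal elements (common mass and width)
    m12, gamma12   : off-diagonal elements (may be complex)
    Returns a list of two (mass, width) pairs for the physical states.
    """
    m12 = complex(m12)
    gamma12 = complex(gamma12)
    diagonal = m - 0.5j * gamma
    h12 = m12 - 0.5j * gamma12
    h21 = m12.conjugate() - 0.5j * gamma12.conjugate()
    root = cmath.sqrt(h12 * h21)  # eigenvalues are diagonal +/- root
    return [(lam.real, -2.0 * lam.imag) for lam in (diagonal + root, diagonal - root)]


if __name__ == "__main__":
    # Non-decaying limit: the masses are M +/- |M12| and the widths vanish.
    print(physical_states(1.0, 0.0, 0.05, 0.0))
    # CP-conserving example with decays (real M12 and Gamma12, arbitrary units).
    print(physical_states(1.0, 0.2, 0.05, 0.1))
```

With real $M_{12}$ and $\Gamma_{12}$ the eigenvalues reduce to masses $M \pm M_{12}$ and widths $\Gamma \pm \Gamma_{12}$, so one state is both shifted in mass and shorter lived, as for the kaons.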


10.4 Flavour oscillations 


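Carrying on with the discussion from Section 10.3, we continue to use
the kaon system as an example, but these results are applicable to all the
meson systems. Since $M_{12}$ arises from second-order weak interactions,
the mass difference $\Delta M = M_1 - M_2 \approx 2M_{12}$ is very small. It can be
measured by examining the strangeness of a decaying $K^0$ beam. Consider
an initially pure $K^0$ state at proper time $t = 0$:⁶

$$|\psi(t = 0)\rangle = |K^0\rangle = \sqrt{\tfrac{1}{2}}\left(|K_1\rangle + |K_2\rangle\right)$$

The state evolves in time as

$$|\psi(t)\rangle = \sqrt{\tfrac{1}{2}}\left(e^{(-iM_1 - \Gamma_1/2)t}|K_1\rangle + e^{(-iM_2 - \Gamma_2/2)t}|K_2\rangle\right) \qquad (10.14)$$

where $\Gamma_1$ and $\Gamma_2$ are the decay widths of $K_1$ and $K_2$. $|\psi(t)\rangle$ can be
re-expressed in terms of $|K^0\rangle$ and $|\bar{K}^0\rangle$ by using eqns 10.1:

$$|\psi(t)\rangle = \tfrac{1}{2}\left(e^{(-iM_1 - \Gamma_1/2)t} + e^{(-iM_2 - \Gamma_2/2)t}\right)|K^0\rangle
+ \tfrac{1}{2}\left(e^{(-iM_1 - \Gamma_1/2)t} - e^{(-iM_2 - \Gamma_2/2)t}\right)|\bar{K}^0\rangle \qquad (10.15)$$

Interference will occur between the terms of frequencies $M_1$ and $M_2$ in
the amplitudes of $|K^0\rangle$ and $|\bar{K}^0\rangle$. The $K^0$ intensity at time $t$ is

$$I(K^0) = |\langle K^0|\psi(t)\rangle|^2
= \tfrac{1}{4}\left(e^{-\Gamma_1 t} + e^{-\Gamma_2 t}\right) + \tfrac{1}{2}\, e^{-(\Gamma_1 + \Gamma_2)t/2}\cos(\Delta M t) \qquad (10.16)$$

Similarly, the intensity of $\bar{K}^0$ from the same initial $K^0$ can be obtained
from $|\langle \bar{K}^0|\psi(t)\rangle|^2$:

$$I(\bar{K}^0) = \tfrac{1}{4}\left(e^{-\Gamma_1 t} + e^{-\Gamma_2 t}\right) - \tfrac{1}{2}\, e^{-(\Gamma_1 + \Gamma_2)t/2}\cos(\Delta M t) \qquad (10.17)$$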


Equations 10.16 and 10.17 imply that the strangeness content of a beam
initially containing strangeness $+1$ $K^0$s will oscillate with time (or
distance in the laboratory). The oscillations will be observable if $\Delta M\,\tau_1 \sim 1$
or greater. Figure 10.2(a) illustrates the behaviour of the $K^0$ and $\bar{K}^0$
intensities given by eqns 10.16 and 10.17. Oscillations are observable for
a few $K_1$ lifetimes. At times $t \gg \tau_1$, the $K_1$ component will have entirely
decayed away, leaving only $K_2$, and the beam will be an equal mixture
of $K^0$ and $\bar{K}^0$ with zero net strangeness. The total kaon intensity at any
time is given by the sum of eqns 10.16 and 10.17 and is as given by
eqn 10.8, as it must be.
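
The behaviour described by eqns 10.16 and 10.17 can be tabulated with a short script. This is a sketch under stated assumptions: the lifetimes and mass difference are the rounded values quoted in this chapter, the conversion uses $\hbar = 6.582 \times 10^{-22}$ MeV s, and the function name is illustrative.

```python
import math

HBAR_MEV_S = 6.582e-22             # reduced Planck constant in MeV s
TAU_1 = 89e-12                     # K1 lifetime in s (from the text)
TAU_2 = 51e-9                      # K2 lifetime in s (from the text)
DELTA_M = 3.483e-12 / HBAR_MEV_S   # mass difference (3.483e-12 MeV) in rad/s


def kaon_intensities(t):
    """Return (I(K0), I(K0bar)) at proper time t for an initially pure K0."""
    gamma_1, gamma_2 = 1.0 / TAU_1, 1.0 / TAU_2
    envelope = 0.25 * (math.exp(-gamma_1 * t) + math.exp(-gamma_2 * t))
    interference = (0.5 * math.exp(-0.5 * (gamma_1 + gamma_2) * t)
                    * math.cos(DELTA_M * t))
    return envelope + interference, envelope - interference


if __name__ == "__main__":
    for n in range(0, 11, 2):
        i_k0, i_k0bar = kaon_intensities(n * TAU_1)
        print(f"t = {n:2d} tau_1: I(K0) = {i_k0:.4f}, "
              f"I(K0bar) = {i_k0bar:.4f}, sum = {i_k0 + i_k0bar:.4f}")
```

The sum column reproduces eqn 10.8 with $N_0 = 1$, and at late times the two intensities converge, reflecting the equal $K^0$ and $\bar{K}^0$ content of the surviving $K_2$.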

The features of the oscillations of each of the four systems are re- 
markably different owing to the different values of the parameters in 
Table 10.1. The intensities from eqns 10.16 and 10.17 are plotted for 
each of the meson systems in Fig. 10.2(a-d), and we describe each in the 
following subsections. 


10.4.1 $K^0$–$\bar{K}^0$ oscillations


Strangeness oscillations can be observed experimentally by starting with
a pure $S = +1$ $K^0$ beam produced, for example, by $K^+ + n \to K^0 + p$ and
tagging the strangeness of the decaying kaon by using the semileptonic
decays $K^0 \to l^+ + \nu_l + \pi^-$ and $\bar{K}^0 \to l^- + \bar{\nu}_l + \pi^+$. The $K_1 - K_2$ mass
difference can be deduced from the period of the oscillation: $\Delta M =
3.483 \times 10^{-12}$ MeV and $\Delta M\,\tau = 0.49$. Being less than one part in $10^{14}$




Fig. 10.2 Intensities of mesons (full lines) and antimesons (dashed lines) as
functions of proper time (in ps) after the production of a pure meson state
according to eqns 10.16 and 10.17 for the kaon (a), $D^0$ (b), $B^0$ (c), and $B_s^0$
(d) systems.
⁷This could equally have been written in terms of CKM elements but in two
generations, $V_{ud} = V_{cs} = \cos\theta_C$ and $V_{cd} = -V_{us} = -\sin\theta_C$.


⁸A similar argument applies to the rare decay $K^0 \to \mu^+\mu^-$. Without the
presence of the $c$ quark, the branching ratio
would be expected to be O(a?), where 
a is the fine-structure constant. Allow- 
ing for the existence of the c quark, the 
branching ratio 


BR ~ a?’ (m? — m2?) /M?, 


where $m_c$, $m_u$, and $M_W$ are the masses of the $c$ quark, $u$ quark, and $W$ boson,
respectively.


⁹The first evidence for D-meson oscillations was found at the B-factories and
at the Tevatron.

¹⁰We implicitly assume the inclusion of charge-conjugate states.

¹¹In a two-generation approximation, the CKM matrix element is $\cos\theta_C$ and
the small value of the Cabibbo angle results in a large value of this element.
Hence these decay modes are called ‘Cabibbo-favoured’.


of the $K^0$ mass, $\Delta M$ is truly tiny, and is the smallest mass difference
that has ever been measured.

A rough dimensional estimate of $\Delta M$ can be made by considering the
‘box’ diagrams for $\Delta S = 2$ $K^0 \leftrightarrow \bar{K}^0$ transitions shown in Fig. 10.1. The
transition is second-order weak with s- and d-quark to u-, c-, or t-quark
transitions occurring at the vertices. The s and d to u and c couplings
are the most important at the relatively low energy scale of the kaons, so

$$\Delta M \sim \langle \bar{K}^0|H_{\text{weak}}|K^0\rangle \sim G_F^2\, m_K^5 \cos^2\theta_C \sin^2\theta_C \qquad (10.18)$$

where $\theta_C$ is the Cabibbo angle.⁷ The use of the kaon mass $m_K$ ensures
that eqn 10.18 has the correct dimensions. A full calculation involves
the evaluation of loop Feynman diagrams (the box diagrams shown in
Fig. 10.1), with the result

Gis ing 
3r? 


G 
2 
mọ 


2 2)2 
AM & arine E (10.19) 
where $f_K \approx 170$ MeV is the experimentally determined kaon decay factor.
In a model without charm, the prediction gives a result ~4000 times
higher than the experimental result, but good agreement is obtained if
the charm quark is included in the calculation. Thus measurement of
neutral kaons gave indirect evidence for the existence of the $c$ quark
and provided a rough prediction for its mass before it was discovered.
This is a classic example of how precision low-energy measurements are
sensitive via loop diagrams to physics at much higher mass scales.


10.4.2 $D^0$–$\bar{D}^0$ oscillations


In general, we should expect to observe $D^0$–$\bar{D}^0$ oscillations because the
same type of box diagrams responsible for kaon oscillations (Fig. 10.1)
will cause $D^0 \leftrightarrow \bar{D}^0$ transitions. However, it turns out to be much more
difficult to observe oscillations involving charm quarks. First (as for B
mesons), there are so many possible decay modes that the lifetimes of the
‘short-lived’ and ‘long-lived’ neutral D-meson states will be very similar (unlike
the case for kaons), so we cannot produce a pure sample of the long-lived
neutral D mesons. Also, the particular values of the CKM elements cause
the rate of oscillations of D mesons to be much slower than their decays,
as illustrated in Fig. 10.2(b). Therefore, we need to observe D mesons
over a time period of many lifetimes to be able to observe oscillations.
This implies that we need very high-statistics samples of D mesons, and
the best place to obtain these is the LHCb experiment at the LHC.⁹
The oscillations were studied by measuring¹⁰ the time dependence of the
ratio

$$R = \frac{N(D^0 \to K^+\pi^-)}{N(D^0 \to K^-\pi^+)} \qquad (10.20)$$

The decays $D^0 \to K^-\pi^+$ involve the quark-level transitions $c \to s$ and
$u \to d$, both of which have a large CKM mixing angle.¹¹ The decays
$D^0 \to K^+\pi^-$ involve the quark-level transitions $c \to d$ and $u \to s$,
both of which depend on the Cabibbo angle as $\sin\theta_C$, and hence these
rare decays are called ‘doubly Cabibbo-suppressed’. We can also have
decays of $D^0$ to the ‘wrong-sign’ $K\pi$ final state if the $D^0$ oscillates into a $\bar{D}^0$
that then decays by a ‘Cabibbo-allowed’ mode to $K^+\pi^-$. Therefore, the
signature for $D^0$ oscillations will be the ratio of ‘wrong-sign’ to ‘right-sign’
decays given by $R$ (see eqn 10.20) increasing with time as more
oscillations occur. From an experimental perspective, we need to


• ‘Tag’ the initial flavour of the $D^0$ (is it a $D^0$ or a $\bar{D}^0$?) at
production.

• Measure the decay time of the $D^0$.

• Identify the decay products of the $D^0$ to find $K^\pm\pi^\mp$ events.


The flavour tagging is done by selecting decays $D^{*+} \to D^0\pi^+$, where the
charge of the pion determines the flavour of the neutral $D$ at production.
This is done by first reconstructing $D^0$ mesons and then combining them
with a $\pi^+$ and looking for a narrow peak in the invariant mass spectrum
of $\Delta m = m(D^0\pi^+) - m(D^0)$. The decay time of the $D^0$ is determined by
measuring the flight path $L$ and momentum¹² $p$, so that the decay time
is given by $t = L\,m_{D^0}/p$. The measured value [103] of the ratio $R$ is
shown in Fig. 10.3. The value of $R$ is increasing with time, as expected
for oscillations. The oscillations are so slow compared with the lifetime
that only a fraction of an oscillation period can be observed.
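
As a rough illustration of the relation $t = L\,m_{D^0}/p$, the sketch below converts a flight path and momentum into a proper decay time, restoring the factor of $c$ that is suppressed in natural units. The function name and example values are hypothetical; the $D^0$ mass (about 1.865 GeV) and mean lifetime (about 0.41 ps) are assumed inputs that do not appear in the text.

```python
C_M_PER_S = 2.998e8     # speed of light in m/s
M_D0_GEV = 1.865        # D0 mass in GeV (assumed value, not from the text)
TAU_D0_S = 0.41e-12     # D0 mean lifetime in s (assumed value, not from the text)


def proper_decay_time(flight_path_m, momentum_gev, mass_gev=M_D0_GEV):
    """Proper decay time t = L m / (p c) in seconds.

    flight_path_m : measured flight distance L in metres
    momentum_gev  : reconstructed momentum p in GeV (natural units, c = 1)
    """
    return flight_path_m * mass_gev / (momentum_gev * C_M_PER_S)


if __name__ == "__main__":
    # Hypothetical candidate: 5 mm flight path, 50 GeV momentum.
    t = proper_decay_time(5e-3, 50.0)
    print(f"t = {t * 1e12:.3f} ps = {t / TAU_D0_S:.2f} mean lifetimes")
```

The ratio $m/p$ equals $1/(\beta\gamma)$, so the formula simply divides the laboratory flight distance by the boost to recover the time elapsed in the rest frame of the meson.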


10.4.3 $B^0$–$\bar{B}^0$ mixing and oscillations


Neutral B mesons come in two varieties, $B_d^0$ and $B_s^0$, containing a
$b$ and either a $d$ or an $s$ quark, respectively.¹³ The masses of the
strong-interaction eigenstates are $M_{B_d} = 5.28$ GeV and $M_{B_s} = 5.37$ GeV.


[Fig. 10.3 plot: $R$ ($\times 10^{-3}$) versus decay time $t/\tau$; legend: data, mixing fit, no-mixing fit.]




¹²The momentum of the neutral particle is reconstructed from the momenta
of the charged decay products.

¹³The usual convention is that $B^0$ means $B_d^0$ and the subscript $d$ is omitted.
$B^0$ contains a $b$ antiquark, whereas $D^0$ contains a $c$ quark.


Fig. 10.3 The ratio $R$ of ‘wrong-sign’ to ‘right-sign’ decay modes as a
function of decay time divided by the mean lifetime. The curve shows a fit
allowing for $D^0$ oscillations and is in good agreement with the data. The
dashed horizontal line shows the ‘no-oscillation’ model. From [103]. The data
point at the largest value of $t/\tau$ covers the range $5 < t/\tau < 20$.

Fig. 10.4 Box diagram for $B^0$–$\bar{B}^0$ transitions.


The large mass means that they have many possible decay modes and
short lifetimes, and so (as with the D mesons, but unlike the kaons), the
$B_1$ and $B_2$ cannot be studied separately.

As discussed in Section 10.3, the condition for observing oscillations
is that $\Delta M\,\tau$ is of the order of unity or greater, where $\Delta M$ is the
mass difference between the physical eigenstates and $\tau$ is the lifetime.
Figure 10.4 shows two diagrams for $B^0$–$\bar{B}^0$ transitions. By direct
analogy with eqn 10.18, the $B_1 - B_2$ mass difference must be proportional
to $M_B^5$ times the appropriate CKM elements, i.e.

$$\Delta M_B = M_{B_1} - M_{B_2} \sim G_F^2\, M_B^5\, |V_{tb}|^2 |V_{td}|^2$$


B mesons have a relatively fast oscillation compared with their lifetime,
which makes oscillation studies easier than for D mesons, because
of two factors: first, only top-type quarks run in the ‘box’ diagram
responsible for oscillations and, second, the lifetime is relatively long
because of the small value of the CKM matrix elements in the decays
($V_{cb}$ and $V_{ub}$).

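By assuming the lifetimes of $B_1$ and $B_2$ are the same, the expressions
for the $B^0 \leftrightarrow \bar{B}^0$ transition probabilities are somewhat simpler than
those for $K^0 \leftrightarrow \bar{K}^0$ (eqns 10.16 and 10.17):

$$P(B^0 \to B^0) = \tfrac{1}{2}\, e^{-\Gamma_B t}\left[1 + \cos(\Delta M_B\, t)\right]$$
$$P(B^0 \to \bar{B}^0) = \tfrac{1}{2}\, e^{-\Gamma_B t}\left[1 - \cos(\Delta M_B\, t)\right] \qquad (10.21)$$

where $\Gamma_B = 1/\tau_B$. Equations 10.21 represent the probabilities of
observing a $B^0$ or $\bar{B}^0$ at some time $t$ after a $B^0$ has been created; the
charge-conjugate expressions hold for an initial $\bar{B}^0$.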

We can tag the flavour of a decaying B meson using the semileptonic
decay modes. Therefore, if we produce a $B^0\bar{B}^0$ pair, we can use events in
which both Bs decay semileptonically and identify events with same-sign
leptons (SS) as having oscillated and those with opposite-sign leptons
(OS) as not having oscillated. We define the usual asymmetry
$A = (\text{OS} - \text{SS})/(\text{OS} + \text{SS})$, and from eqn 10.21 we can see that $A = \cos(\Delta M_B\, t)$.
Therefore, if we can measure the frequency of these oscillations, we can
determine the mass difference between the light and heavy states.
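
The step from eqn 10.21 to the asymmetry can be verified numerically. The sketch below is illustrative only: the function names are hypothetical, times are in ps, the oscillation frequency is the 0.480 ps⁻¹ quoted in the caption of Fig. 10.5, and the lifetime of 1.52 ps is an assumed value that does not appear in the text.

```python
import math


def unmixed_probability(t, gamma, delta_m):
    """P(B0 -> B0) from eqn 10.21."""
    return 0.5 * math.exp(-gamma * t) * (1.0 + math.cos(delta_m * t))


def mixed_probability(t, gamma, delta_m):
    """P(B0 -> B0bar) from eqn 10.21."""
    return 0.5 * math.exp(-gamma * t) * (1.0 - math.cos(delta_m * t))


def mixing_asymmetry(t, gamma, delta_m):
    """A = (OS - SS) / (OS + SS), with OS unmixed and SS mixed."""
    opposite_sign = unmixed_probability(t, gamma, delta_m)
    same_sign = mixed_probability(t, gamma, delta_m)
    return (opposite_sign - same_sign) / (opposite_sign + same_sign)


if __name__ == "__main__":
    tau_b = 1.52      # ps, assumed B0 lifetime
    delta_m = 0.480   # ps^-1, value quoted in the Fig. 10.5 caption
    for t in (0.0, 1.0, 2.0, 4.0, 6.0):
        a = mixing_asymmetry(t, 1.0 / tau_b, delta_m)
        print(f"t = {t:3.1f} ps: A = {a:+.3f}, cos(dM t) = {math.cos(delta_m * t):+.3f}")
```

The exponential decay factor cancels in the ratio, which is why the asymmetry isolates the oscillation frequency independently of the lifetime.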

$B^0$–$\bar{B}^0$ mixing can be detected by producing $B^0\bar{B}^0$ pairs, for example
in an $e^+e^-$ collider, and observing their semileptonic decays. At the
quark level,

$$b \to c + W^- \to c + \mu^- \bar{\nu}_\mu$$
$$\bar{b} \to \bar{c} + W^+ \to \bar{c} + \mu^+ \nu_\mu$$

and one signature for $B^0$–$\bar{B}^0$ oscillations is the observation of like-sign
muon pairs, $\mu^\pm\mu^\pm$. Figure 10.5 shows the fraction of all muon pairs that
have the same sign,

$$F = \frac{N^{++} + N^{--}}{N^{++} + N^{--} + N^{+-}}$$

versus proper time in the $B^0$ system measured by the DELPHI
experiment at the LEP $e^+e^-$ collider. There is clearly a significant,
time-dependent excess of like-sign pairs, which is attributable to $B^0$–$\bar{B}^0$
mixing. It can be deduced from these data that $\Delta M_{B_d} \approx 3 \times 10^{-10}$ MeV
and $\Delta M_{B_d}\tau \approx 0.7$.¹⁴
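
Mass differences are quoted in this chapter both in MeV and as angular frequencies in ps⁻¹; the two are related by $\hbar$. The following sketch shows the conversion, assuming $\hbar = 6.582 \times 10^{-10}$ MeV ps. The helper names are illustrative, and the $B^0$ lifetime of 1.52 ps is an assumed value not given in the text.

```python
HBAR_MEV_PS = 6.582e-10   # reduced Planck constant in MeV ps


def per_ps_to_mev(delta_m_per_ps):
    """Convert an oscillation frequency in ps^-1 to a mass difference in MeV."""
    return delta_m_per_ps * HBAR_MEV_PS


def mev_to_per_ps(delta_m_mev):
    """Convert a mass difference in MeV to an oscillation frequency in ps^-1."""
    return delta_m_mev / HBAR_MEV_PS


if __name__ == "__main__":
    dm_bd = 0.480   # ps^-1, from the caption of Fig. 10.5
    tau_b = 1.52    # ps, assumed B0 lifetime
    print(f"Delta M(B_d) = {per_ps_to_mev(dm_bd):.3e} MeV")
    print(f"Delta M(B_d) * tau = {dm_bd * tau_b:.2f}")
    # The kaon mass difference quoted in Section 10.4.1, as a frequency.
    print(f"Delta M(K) = {mev_to_per_ps(3.483e-12):.5f} ps^-1")
```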


10.4.4 $B_s$ oscillations


The study of the neutral $B_s$ system has only recently become possible,
because the best place to study it is in high-energy hadron colliders such
as the LHC (and the Tevatron before that). The LHCb detector at the
LHC is a special facility for studying B mesons; Section 10.7.3 describes
many of the general features of the LHCb detector.

To study $B_s$ oscillations, we first need to identify the flavour of the
neutral B hadrons at their decay to know if they are either $B_s^0$ or $\bar{B}_s^0$.
Second, we use the fact that the B hadrons arise from $b\bar{b}$ production
to infer the flavour that the signal B hadron had at creation from the
flavour of the other B hadron in the event (this is called opposite-side
tagging). This can be achieved by several algorithms including the charge
of the leptons ($e$ or $\mu$)¹⁵ from semileptonic decays.¹⁶ Same-side tagging
(SST) can also be used to identify the flavour of the $B_s^0$.¹⁷ The $B_s^0$ are
identified by a flavour-specific decay mode and the $D_s^-$ are identified by
decay modes such as $D_s^- \to \phi\pi^-$ with $\phi \to K^+K^-$. The very good $K/\pi$
separation provided by the RICH detectors reduces the combinatorial
backgrounds in the mass reconstruction. The pion charge identifies the
flavour of the $B_s^0$.

An event with a $B$ and a $\bar{B}$ hadron would be identified as non-mixed,
whereas an event with two $B$ or two $\bar{B}$ hadrons would arise from
mixing. Finally, we need to measure the decay time. The $B_s$ oscillations
Fig. 10.5 Fraction of like-sign muons due to $B^0$–$\bar{B}^0$ mixing versus proper
time in the $B^0$ system observed by the DELPHI experiment at LEP [72].
The curve is the result of a prediction with $\Delta M_{B_d} = 0.480$ ps⁻¹
($3.159 \times 10^{-10}$ MeV).


¹⁴The early observation of this mixing was a surprise because it was only
explicable in the Standard Model if the mass of the top quark was very large,
and this measurement gave the first indication that the top quark would be
so heavy. Here again, we see how precision low-energy measurements can give
access to physics at much higher mass scales via the loop diagrams.

¹⁵The leptons in this analysis are restricted to $e$ or $\mu$.

¹⁶$b \to c\, l^- \bar{\nu}_l$, so a negative (positive) lepton arises from the decay of a $b$ ($\bar{b}$).

¹⁷From conservation of strangeness, the $B_s^0$ must be produced in association
with an $\bar{s}$ quark. If this $\bar{s}$ hadronizes to form a charged kaon, the sign of that
kaon identifies the flavour of the $B_s^0$.
Fig. 10.6 Decay time distributions for $B_s^0$ for mixed and unmixed events. The
suppression of events at low values of the decay time is due to the trigger
acceptance. From [104].

¹⁸A mistag occurs when the tagging algorithm arrives at the wrong conclusion.
The effect is to dilute the magnitude but not the period of the observed
oscillations.

¹⁹By making a beam of neutral kaons and letting the $K_1$ component decay
away in flight.

²⁰To be described in Section 10.5.3.
are very rapid, so excellent decay time resolution is essential. The oscillation
probability is similar to that derived for kaon oscillations (eqn 10.17)
after allowing for the effects of the finite experimental resolution and the
probability of a ‘mistag’.¹⁸ We have the characteristic oscillation term
$\pm\cos(\Delta M_{B_s} t)$. The sign is positive for the case where the flavour is the
same at production and decay and negative for mixed flavour at production
and decay, as in eqns 10.16 and 10.17. The characteristic oscillations are
clearly seen in the data [104] shown in Fig. 10.6.


10.4.5 Regeneration 


A further interesting phenomenon related to oscillations, consequent on
the very different lifetimes $\tau_1$ and $\tau_2$ in the kaon system, is regeneration
(the discovery paper is [61]). It is an experimental consideration that
must be controlled carefully in order to measure CP violation, as we
will see in the next section. Regeneration occurs when we make¹⁹ a
beam of pure $K_2$ and then let it interact via the strong interaction by
hitting some material. Because the strong interaction is involved, the $K^0$
and $\bar{K}^0$ components of the $K_2$ must be considered. $K^0$ and $\bar{K}^0$ interact
differently in nuclear matter. The $\bar{K}^0$ cross-section is greater than the $K^0$
cross-section because $\bar{K}^0$ has a $d$ antiquark, which can annihilate with
a $d$ quark in a nucleon to produce hyperons (e.g. $\bar{K}^0 + p \to \Lambda^0 + \pi^+$).
There is no equivalent interaction for the $K^0$. Regeneration is not the
effect of a single collision with the material; it is a quantum-coherent
effect in which the amplitudes of interactions with many nuclei in the
material all sum together. The kaon is not deviated in its path and does
not suffer any energy loss.

The consequence of this is that after a $K_2$ beam passes through an
absorber, the amounts of $K^0$ and $\bar{K}^0$ will change and the beam will
no longer be pure $K_2$; it will contain some amount of $K_1$. $K_1 \to \pi\pi$
decays will be observed again after the absorber. This phenomenon is
called regeneration. Figure 10.8 shows data from the KTeV experiment
at Fermilab,²⁰ which is an example of the regeneration effect.

Regeneration measurements allow the sign of $\Delta M$ to be determined.
It is found that $\Delta M = M_2 - M_1 = 3.483 \times 10^{-12}$ MeV, i.e. that $K_2$ is
heavier than $K_1$.
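
The way unequal $K^0$ and $\bar{K}^0$ interactions bring back a $K_1$ component can be illustrated with a toy calculation. This is a deliberately simplified sketch: `f` and `f_bar` are hypothetical transmission amplitudes for the $K^0$ and $\bar{K}^0$ components, and the time evolution inside the absorber and the details of coherent forward scattering are ignored.

```python
import math


def after_absorber(f, f_bar):
    """Return the (K1, K2) amplitudes of an initially pure K2 after an absorber.

    f, f_bar : hypothetical (possibly complex) transmission amplitudes
               applied to the K0 and K0bar components respectively.
    """
    # Start from K2 = (K0 - K0bar)/sqrt(2) and attenuate each flavour component.
    k0 = f / math.sqrt(2.0)
    k0_bar = -f_bar / math.sqrt(2.0)
    # Project back onto K1 = (K0 + K0bar)/sqrt(2) and K2 = (K0 - K0bar)/sqrt(2).
    k1 = (k0 + k0_bar) / math.sqrt(2.0)
    k2 = (k0 - k0_bar) / math.sqrt(2.0)
    return k1, k2


if __name__ == "__main__":
    print(after_absorber(1.0, 1.0))   # equal interactions: no K1 is regenerated
    print(after_absorber(0.9, 0.7))   # K0bar absorbed more strongly: K1 reappears
```

The regenerated $K_1$ amplitude is $(f - \bar{f})/2$, so it vanishes only if the two flavour components interact identically with the material.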


10.5 CP violation (part 1) 


In addition to the interesting oscillation effects described up to now, the 
neutral meson systems also exhibit the fundamental effect of violation 
of CP symmetry. This effect has now been seen in a wide range of 
places, including decays of charged mesons. There are three main ways 
in which CP violation can be observed, which are listed in Table 10.2 
for future reference. There is now strong evidence that the CP violation 
we see is due to a mechanism proposed by Kobayashi and Maskawa 
connected with the CKM matrix. We will follow the historical route 
in our description here and discuss the first experimental evidence for
CP violation, which came from the kaon system, then the Kobayashi-Maskawa
mechanism, then briefly other CP-violating effects with kaons,
which were small and required high-precision experiments. There then
followed an extensive period, which is still in progress, where strong
CP-violating effects in B-meson systems became experimentally accessible,
leading to tight constraints on the parameters in the CKM matrix.


10.5.1 Discovery of CP violation 


The first experimental evidence of CP violation came in 1964 when
long-lived neutral kaons were observed to decay to two pions [61]. The result
of this experiment, which is entirely consistent with what is known today,
was the first evidence for CP violation and was completely unexpected
at the time: the decay $K_2 \to 2\pi$ was expected to be strictly forbidden
by CP conservation, as described above. Figure 10.7 shows data from
a more recent experiment (NA31 [112]) with high statistics and shows
the number of $K^0 \to \pi^+\pi^-$ decays versus time. The fast exponential
component is from the CP-conserving decay $K_1 \to 2\pi$; the constant
component that remains after $t \sim 15\tau_1$ shows that $CP = +1$ $\pi^+\pi^-$
states appear to be produced in a region where only $CP = -1$ $K_2$ states
should exist. Shortly after the discovery of CP violation in $\pi^+\pi^-$ final
states, separate experiments found $\pi^0\pi^0$ states with invariant masses
consistent with being kaons produced in the region where only the $K_2$
(i.e. no $K_1$) should exist. Both of these effects are violations of CP
symmetry.

At this point, the question that arises when thinking of the processes
as Feynman diagrams is: Where is the CP violation happening? It could
be either in the Feynman diagrams representing the mixing or in those
representing the decay²¹ (or both). The answer, it turns out, is that it
is mainly in the mixing in the kaon system.
Table 10.2 Main ways of observing CP violation.

(a) In mixing
(b) In decays
(c) In interference between several decay diagrams leading to the same final state

²¹Types (a) or (b) in Table 10.2.
Fig. 10.7 Example of CP violation in $K \to \pi^+\pi^-$ decays from NA31 [112].
The data are the number of $K \to \pi^+\pi^-$ decays as a function of time. Below
10 lifetimes, the main feature is CP-conserving $K_S$ decay. Above about 15
lifetimes, the CP violation effect is visible, because the decay rate does not
keep falling exponentially. (Interference is possible between these two ways in
which the kaons can decay. The inset shows the difference between the data
and a fit without the interference. A fit with interference nicely follows the
oscillations of the data.)

²²So the formalism we have used up to now is still valid.

²³It turns out that the $K_L$ is the heavier of the two kaon states, so
unfortunately the names $B_L$ and $K_L$ do not correspond to each other.
To incorporate the effect of CP violation into the mixing formalism
that we have been developing, we introduce two more state labels, $K_S$
and $K_L$, to represent the physical states that decay with the short and
long lifetimes, respectively. If CP were conserved, $|K_S\rangle = |K_1\rangle$ and
$|K_L\rangle = |K_2\rangle$, but to include CP violation, we maintain the definitions
of $K_1$ and $K_2$ as CP eigenstates, defined by eqn 10.1.²² The $K_L$ state
represents the component in the beam once all the short-lived kaons
have decayed away, which is mostly the $CP = -1$ $K_2$, but includes an
admixture of a $K_1$ component, which is what produces the $\pi\pi$ decays. In
the other meson systems, the nomenclature is slightly different, although
the concept is exactly the same. Since the B mesons all have lots of decay
modes, the lifetimes are not very different, and so the states are labelled
based on the mass splitting as light²³ $B_L$ and heavy $B_H$. Returning to
Fig. 10.7, the level of CP violation can be characterized by measuring the
decay-rate ratio $|\eta_{+-}|^2 = \Gamma(K_L \to \pi^+\pi^-)/\Gamma(K_S \to \pi^+\pi^-)$; the current
average of measurements is $|\eta_{+-}| = (2.232 \pm 0.011) \times 10^{-3}$.


10.5.2 Semileptonic charge asymmetry 


The next indication of CP violation was an observation of an asymmetry
in the semileptonic decays of the $K_L$:

$$A_L = \frac{\Gamma(K_L \to \pi^- l^+ \nu) - \Gamma(K_L \to \pi^+ l^- \bar{\nu})}
{\Gamma(K_L \to \pi^- l^+ \nu) + \Gamma(K_L \to \pi^+ l^- \bar{\nu})} \qquad (10.22)$$

The choice of measuring a ratio like this minimizes experimental
systematic effects; nevertheless, the experimental set-up needs to have
as little material as possible in the path of the particles because
the interaction cross sections of $\pi^+$ and $\pi^-$ are different²⁴ and could
affect the measurement if too large. The semileptonic asymmetry $A_L$ is
measured to be $(3.32 \pm 0.06) \times 10^{-3}$. If CP were conserved, it would
be zero. The fact that this asymmetry is present demonstrates that the
CP violation is occurring in the mixing of the kaons rather than in their
decay. Another way of saying this is that this measurement shows that
$\langle K_L|K_S\rangle \neq 0$.

We can formulate the combined effects of mixing and CP violation
using the relations in eqns 10.1 between the $K_1$, $K_2$ and the $K^0$, $\bar{K}^0$,
and then expressing $K_S$ and $K_L$ in terms of $K^0$ and $\bar{K}^0$ as follows:

$$|K_S\rangle = p|K^0\rangle + q|\bar{K}^0\rangle$$
$$|K_L\rangle = p|K^0\rangle - q|\bar{K}^0\rangle \qquad (10.23)$$

where $p$ and $q$ are complex coefficients with $|p|^2 + |q|^2 = 1$. $\langle K_L|K_S\rangle =
|p|^2 - |q|^2$, which is zero if CP is conserved. Considering the charge
asymmetry $A_L$ again, let the amplitude for $K^0 \to \pi^- l^+ \nu$ be $f$, so
the amplitude for $\bar{K}^0 \to \pi^+ l^- \bar{\nu}$ is $f^*$. It is now possible to show that
$A_L = |p|^2 - |q|^2$ and therefore that the non-zero measured value of $A_L$
indicates that the $K_S$ and $K_L$ are not orthogonal to each other, and
hence that CP violation is occurring in the mixing.

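An alternative way to write the effects of mixing and CP violation is
to define $K_1$ and $K_2$ in terms of $K^0$ and $\bar K^0$ with eqns 10.1 as before,
and then define $K_S$ and $K_L$ in terms of $K_1$ and $K_2$ with a small complex
impurity parameter $\tilde\varepsilon$ as

\[
|K_S\rangle = \frac{|K_1\rangle + \tilde\varepsilon\,|K_2\rangle}{\sqrt{1 + |\tilde\varepsilon|^2}}
\tag{10.24}
\]

\[
|K_L\rangle = \frac{|K_2\rangle + \tilde\varepsilon\,|K_1\rangle}{\sqrt{1 + |\tilde\varepsilon|^2}}
\tag{10.25}
\]

With this definition, $p/q = (1 + \tilde\varepsilon)/(1 - \tilde\varepsilon)$,
$A_L = 2\,\mathrm{Re}(\tilde\varepsilon)/(1 + |\tilde\varepsilon|^2) \approx 2\,\mathrm{Re}(\tilde\varepsilon)$, and $\eta_{+-} = \tilde\varepsilon$.
From 1964 until around 1999, all the CP-violating
effects measured could be characterized by the single parameter $\tilde\varepsilon$.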
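
The relations between the impurity parameter, $p/q$, and $A_L$ can be checked numerically. The sketch below is only illustrative: it takes the magnitude of the impurity parameter from the measured $|\eta_{+-}|$ quoted earlier and assumes a phase of about 43.5 degrees, a value that is not given in the text. The function names are hypothetical.

```python
import cmath
import math


def semileptonic_asymmetry(eps: complex) -> float:
    """A_L = 2 Re(eps) / (1 + |eps|^2) for a complex impurity parameter eps."""
    return 2.0 * eps.real / (1.0 + abs(eps) ** 2)


def p_over_q(eps: complex) -> complex:
    """p/q = (1 + eps) / (1 - eps)."""
    return (1 + eps) / (1 - eps)


EPS_MAGNITUDE = 2.232e-3   # |eta_+-| from the text, used as |eps|
EPS_PHASE_DEG = 43.5       # assumption: phase of eps, not given in the text

eps = cmath.rect(EPS_MAGNITUDE, math.radians(EPS_PHASE_DEG))
a_l = semileptonic_asymmetry(eps)

# Cross-check: with |p|^2 + |q|^2 = 1, A_L should equal |p|^2 - |q|^2.
ratio_sq = abs(p_over_q(eps)) ** 2
p_sq = ratio_sq / (1.0 + ratio_sq)
q_sq = 1.0 / (1.0 + ratio_sq)

print(f"A_L from eps       = {a_l:.3e}")
print(f"|p|^2 - |q|^2      = {p_sq - q_sq:.3e}")
```

With these inputs the asymmetry comes out at a few parts in a thousand, the same size as the measured value of $A_L$.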
Theories were proposed to explain CP violation when it was first
discovered. In particular, a new $\Delta S = 2$ ‘superweak’ interaction specific
to the kaon system was considered. The CP violation in the kaon system
was studied extensively over nearly forty years before evidence for CP
violation in $B^0$ systems (which will be discussed shortly) was found.
The next sections will outline how CP violation can be accommodated
in the Standard Model by introducing a complex phase into the 3 × 3
CKM quark mixing matrix (see Chapter 7).²⁵

²⁴ For similar reasons to those accounting for the difference in $K^0$ and
$\bar K^0$ interaction cross sections in Section 10.4.5.

²⁵ This requires a third generation and it is interesting to note that this
was proposed after CP violation had been observed but long before either
of the third-generation quarks had been discovered.

²⁶ In reality one big beam made from a target at z = 0 that was collimated
into two beams side by side. The regenerator beam was attenuated to avoid
a huge rate of decays, which is why no $K_L$ is visible in that beam.


10.5.3 CP violation in $K^0$ decay


One way to gain insight into the mechanism causing CP violation is to
look for CP violation in the decay of particles; it was first observed in
the decay of $K_L$ particles. If all the CP violation were occurring in the
mixing, then the decays of $K_L$ to $\pi\pi$ should have exactly the same
features as the decays of $K_S$ to $\pi\pi$, because, in both cases, it is just
$K_1$ decaying. In particular, the ratio of decays to $\pi^0\pi^0/\pi^+\pi^-$ should be
the same for both $K_L$ and $K_S$. There was an extensive programme of
experimental research to measure this, and the results were expressed in
terms of the double ratio $R = |\eta_{00}|^2/|\eta_{+-}|^2$, where

\[
\eta_{00} = \frac{A(K_L \to \pi^0\pi^0)}{A(K_S \to \pi^0\pi^0)}
\quad\text{and}\quad
\eta_{+-} = \frac{A(K_L \to \pi^+\pi^-)}{A(K_S \to \pi^+\pi^-)}
\]

and $A(K \to \ldots)$ represents the amplitudes of the decays. Unfortunately,
most of the $K \to \pi\pi$ decays produce the pions in a single isospin state
(I = 0), and so, from isospin symmetry (see Section 5.2.1), the ratio
$\pi^0\pi^0/\pi^+\pi^-$ is the fixed value of ½ no matter what the production
mechanism of the $\pi\pi$ state is. However, a small fraction of the $\pi\pi$ are
made in an isospin I = 2 state, which has a different value of the
$\pi^0\pi^0/\pi^+\pi^-$ ratio, and so it is possible to use the double ratio R to
detect CP violation in decays; the effect is very small, however.

Experimentally, this double ratio was a good quantity to measure
since it allowed tricks to cancel systematic errors, for example the use
of the same detectors to measure the $K_L$ and $K_S$. The NA31 experiment
at CERN took data alternately in $K_L$ mode (where the target
was far upstream to allow the $K_S$ to decay before reaching the
experiment) and $K_S$ mode (where a target close to the experiment, with a
far less intense proton beam, was used; the $K_S$ target was moved on rails
to different positions to reproduce a decay distribution similar to the
almost-flat $K_L$ beam to minimize acceptance differences). The KTeV
experiment at Fermilab [93] used two simultaneous $K_L$ beams²⁶ with an
absorber in one of them to regenerate a $K_S$ beam (its predecessor, the
E731 experiment [116], used a similar technique). Figure 10.8 shows the
reconstructed decay position distribution along the beam direction z and
illustrates the difference in decay distributions from the $K_L$ and $K_S$. It
also illustrates the phenomenon of kaon regeneration. NA48 (a successor
to NA31) had both a $K_L$ and a $K_S$ beam, produced from separate proton
beams. Both KTeV and NA48 allowed simultaneous measurement
of $K_L$ and $K_S$ to remove any small time-varying systematic effects in
the detectors. The layout of NA48 is shown in Fig. 10.9.
Fig. 10.8 Distribution of decay positions z of $\pi^+\pi^-$ events along the
beam direction in the KTeV experiment [93], which used two beams of $K_L$
and a regenerator (a block of material) situated at z = 125 m in one of
them to produce $K_S$ particles. The regeneration effect is clearly visible
and the difference in the decay distributions due to the different $K_S$
and $K_L$ lifetimes is very apparent. The acceptance of the detector varies
slowly with z, which produces the non-flat $K_L$ distribution. Decays
between 110 and 158 m are used in the analysis.

Fig. 10.9 Layout of the NA48 experiment at CERN to measure CP violation
in kaons. Compared with a collider experiment, the detector components
are really very similar, just rolled out in a line rather than in a
cylinder, so that the particles encounter the tracking + magnet first,
then the calorimetry (electromagnetic, then hadronic), and finally the
muon detectors. The kaon experiments are very long and narrow, reflecting
the large Lorentz boost to which the ~100 GeV kaons are subjected.
From [113].

²⁷ Liquid argon and lead for NA31, lead glass for E731, caesium iodide
crystals for KTeV, and liquid krypton for NA48.

²⁸ It can be shown that $\eta_{+-} = \varepsilon + \varepsilon'$, $\eta_{00} = \varepsilon - 2\varepsilon'$, and
$R = 1 - 6\,\mathrm{Re}(\varepsilon'/\varepsilon)$.

All four of these experiments had extremely high-resolution
electromagnetic calorimeters²⁷ capable of measuring high rates of particles,
which were vital to separate the $\pi^0\pi^0$ decays from the far more
numerous CP-conserving $K_L \to \pi^0\pi^0\pi^0$ decays. The $\pi^+\pi^-$ decays were
measured with a spectrometer (magnet + drift chambers) and backgrounds
from three-body decays ($\pi^+\pi^-\pi^0$, $\pi^\pm e^\mp\nu$, and $\pi^\pm\mu^\mp\nu$) were
removed by (a) looking at the transverse momentum $p_T$ distribution of
the events (most two-particle events that are background have larger $p_T$,
indicating another particle that was missed), (b) checking that the
momentum in the spectrometer $p$ was inconsistent with the energy in the
electromagnetic calorimeter $E$ (since $E/p$ is close to 1 for electrons), and
(c) checking that the reconstructed invariant mass of the two tracks was
close to $m_K$. The experiments also had sophisticated multilevel triggers.

The combined result of all these experiments is R = 0.9899 ± 0.0012.
Since this is not consistent with R = 1, this is evidence for CP violation
in the decay of the $K_L$ particle, or, equivalently, it shows that the
$K_2$ state can decay to two pions directly. This excluded the superweak
interpretation of CP violation and was consistent with the CKM model
of CP violation, to be described next. To incorporate this into the
formalism above, the parameter $\tilde\varepsilon$ is replaced by two parameters²⁸
$\varepsilon$ ($\approx \tilde\varepsilon$) and $\varepsilon'$, where $\varepsilon \approx 2 \times 10^{-3}$ characterizes the CP
violation in the mixing and $\varepsilon' \approx 3 \times 10^{-6}$ characterizes CP violation
in the decay (type (b) in Table 10.2).
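
As a short worked example, the combined double ratio can be turned into a value of $\mathrm{Re}(\varepsilon'/\varepsilon)$ using the margin-note relation $R = 1 - 6\,\mathrm{Re}(\varepsilon'/\varepsilon)$. The sketch below assumes simple linear error propagation and uses a hypothetical function name; it is not the experiments' own procedure.

```python
def re_eps_prime_over_eps(double_ratio: float, sigma_ratio: float):
    """Invert R = 1 - 6 Re(eps'/eps) and propagate the uncertainty linearly."""
    value = (1.0 - double_ratio) / 6.0
    sigma = sigma_ratio / 6.0
    return value, sigma


# Combined double ratio quoted in the text.
value, sigma = re_eps_prime_over_eps(0.9899, 0.0012)
print(f"Re(eps'/eps) = {value:.2e} +/- {sigma:.1e}")
```

The result, roughly $1.7 \times 10^{-3}$ with an uncertainty of $2 \times 10^{-4}$, matches the ratio of the approximate values of $\varepsilon'$ and $\varepsilon$ given above.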


10.6 CP violation in the Standard Model 


The CKM matrix was introduced in Section 7.3.3 to explain the rotation
between quark states $|d\rangle$, $|s\rangle$, $|b\rangle$ (flavour eigenstates) produced in strong
interactions and the $|d'\rangle$, $|s'\rangle$, $|b'\rangle$ states that couple with the W boson.
The CKM matrix elements are needed in weak decays involving quarks.
As also mentioned in Section 7.3.3, Kobayashi and Maskawa found that
by extending the matrix to be a 3 × 3 matrix, they were able to insert
a non-trivial complex phase into it. The presence of the phase, which
appears in the transition amplitudes, can cause T violation, since


Tie ik) +> eiki+é 


and hence, via the CPT theorem, CP violation is expected. 

This does not work with a 2 × 2 matrix, for which the unitarity condition
means that all complex phases can be removed without affecting any
observables, and so the discovery of CP violation along with the work of
Kobayashi and Maskawa was an early indication for the third generation
of quarks. Products of the CKM elements appear in the meson decay
amplitudes, and the phase from the CKM matrix accounts for the
CP-violating effects seen so far. The formalism allows insights into the
likely magnitude of CP-violating effects in other processes by examining
the relation between the elements.
The unitarity of the CKM matrix V requires that $V^\dagger V = 1$. In terms
of the individual elements, this²⁹ gives nine relationships:

\[
\sum_{i=1,3} |V_{ij}|^2 = 1 \qquad (j = 1, 2, 3)
\tag{10.26}
\]

\[
\sum_{i=1,3} V_{ij} V_{ik}^* = \sum_{i=1,3} V_{ji} V_{ki}^* = 0
\qquad (j, k = 1, 2, 3;\ j \neq k)
\tag{10.27}
\]

Each of the six relations of eqn 10.27 is a sum of three complex
numbers and forms a ‘unitarity’ triangle in the complex plane. It can be
shown that

\[
|\mathrm{Im}(V_{km} V_{ln} V_{kn}^* V_{lm}^*)| = |\mathrm{Im}(V_{mk} V_{nl} V_{nk}^* V_{ml}^*)| = J
\tag{10.28}
\]

irrespective of k, l, m, n (with $k \neq l$ and $m \neq n$), and all six triangles
have the same area, $A = \tfrac{1}{2}J$, independent of any phase convention.
J is known as the Jarlskog invariant.
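
These statements can be checked numerically. The sketch below builds a unitary 3 × 3 matrix in the widely used three-angle, one-phase parametrization, which is an assumption here since the text does not specify one, and the numerical inputs are illustrative rather than fitted values. It verifies the unitarity relations, evaluates J from two different index choices, and compares the triangle areas with J/2. All function names are hypothetical.

```python
import cmath
import math


def ckm_matrix(s12, s23, s13, delta):
    """3x3 unitary matrix from three mixing-angle sines and one phase."""
    c12, c23, c13 = (math.sqrt(1.0 - s * s) for s in (s12, s23, s13))
    e = cmath.exp(1j * delta)  # e^{+i delta}; s13 / e carries e^{-i delta}
    return [
        [c12 * c13, s12 * c13, s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]


def column_product(V, j, k):
    """sum_i V_ij V_ik^*: equals 1 for j == k and 0 otherwise if V is unitary."""
    return sum(V[i][j] * V[i][k].conjugate() for i in range(3))


def quartet(V, k, l, m, n):
    """Im(V_km V_ln V_kn^* V_lm^*); its magnitude is J when k != l and m != n."""
    return (V[k][m] * V[l][n] * V[k][n].conjugate() * V[l][m].conjugate()).imag


def triangle_area(V, j, k):
    """Area of the triangle whose sides are the three terms V_ij V_ik^*."""
    a = V[0][j] * V[0][k].conjugate()
    b = V[1][j] * V[1][k].conjugate()
    return 0.5 * abs((a * b.conjugate()).imag)


# Illustrative inputs only (assumptions, not values taken from the text).
V = ckm_matrix(s12=0.225, s23=0.042, s13=0.0037, delta=1.14)
J = abs(quartet(V, 0, 1, 0, 1))

print(f"J = {J:.2e}")
print("area / J for columns (1,3):", triangle_area(V, 0, 2) / J)
print("area / J for columns (1,2):", triangle_area(V, 0, 1) / J)
```

Both triangles return an area of J/2 even though their shapes are very different, which is the content of the statement that the area is independent of the triangle chosen.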

Figure 10.10 shows one of the triangles involving the CKM elements
responsible for strange and charm (D-meson) decays and one responsible
for B decays. The angles of the triangle in Fig. 10.10(b), which represents
the unitarity constraint $V_{ud}V_{ub}^* + V_{cd}V_{cb}^* + V_{td}V_{tb}^* = 0$, are large. We will
see in Section 10.7.1 that the angles in the unitarity triangle determine the
magnitude of the CP-violating effects. Therefore, the large angles in the
triangle describing $B^0$ decays correspond to large CP-violating effects.
Conversely, the angles in the triangle describing $K^0$ and $D^0$ decays are
small. The CKM ansatz therefore predicts large CP violation in B-meson
systems but very little in $K^0$ and $D^0$ systems. In fact, any CP
violation in $D^0$ systems is expected to be almost negligibly small.³⁰ Any
CP violation in $D^0$ decays at a level of more than a few parts in a
thousand would be strong evidence for new physics.

A second way to explore whether phenomena beyond the CKM explanation
of CP violation exist is to make accurate measurements of the
structure of the CKM matrix. By making separate measurements of the
angles $\alpha$, $\beta$, and $\gamma$ shown in Fig. 10.10, and the lengths of the sides of
the triangle, we can check whether they are all consistent. Inconsistency
could indicate, for example, that the 3 × 3 matrix is just a part of a
larger matrix, or that the CKM formalism is not correct.

To find processes in which CP-violating effects could be present, two
things are needed:


(1) The process must have diagrams involving all three generations of 
quarks (if this is not true, we could shift the complex phase into 
the part of the matrix of the missing generation, so it would not 
appear in the decay amplitudes). 

(2) There needs to be more than one diagram to get to the final 
state (if there is only one diagram, the phase drops out when 
calculating |A|?). 


²⁹ $V^\dagger \equiv (V^*)^T$.

Fig. 10.10 Unitarity triangles responsible for (a) strange and charm
decays and (b) B decays. The angles $\alpha$, $\beta$, and $\gamma$ are also known as
$\phi_2$, $\phi_1$, and $\phi_3$, respectively.

³⁰ The reason for this is that the mixing and decays of $K^0$ and $D^0$ involve
mostly the first two quark generations; CP violation is observable in the
kaon system because of the long lifetime of $K_L$.

Fig. 10.11 Two Feynman diagrams for $K_L$ decays to two pions. The
interference between diagram (a) and diagram (b) (called a ‘penguin
diagram’) generates ‘direct’ CP violation (i.e. there is CP violation
without mixing).

³¹ The parameters p and q can have different values in each of the meson
systems, although, from unitarity, $|p|^2 + |q|^2 = 1$ in each case.

The CP violation in the mixing, proceeding via diagrams in Fig. 10.1, 
satisfies these criteria because there are several such diagrams and some 
of them involve the t quark inside the loop, and so some of the dia- 
grams involve all three generations. Kaon decay can occur by diagrams 
of the form shown in Fig. 10.11; the more complicated ‘penguin’ dia- 
grams that can contain quarks from the third generation in the loop 
provide a mechanism for CP violation in the decay. 


10.6.1 Mixing with CP violation 


We look again at the mixing in meson decays and derive expressions
for the time variation of the states, but now including CP violation as
introduced in eqns 10.23. We have previously used symbols related to the
kaons, even though many expressions have been valid for all the meson
systems.³¹ For variety (and because we will use the formulae to describe
B-meson physics next), we use the symbols for the B mesons here. This
involves the substitution from ($K^0$, $\bar K^0$, $K_1$, $K_2$, $K_S$, $K_L$) to
($B^0$, $\bar B^0$, $B_1$, $B_2$, $B_L$, $B_H$). From eqns 10.23, the light and heavy
eigenstates written with B symbols are

\[
\begin{aligned}
|B_L\rangle &= p\,|B^0\rangle + q\,|\bar B^0\rangle \\
|B_H\rangle &= p\,|B^0\rangle - q\,|\bar B^0\rangle
\end{aligned}
\tag{10.29}
\]

Inverting these, we obtain

\[
\begin{aligned}
|B^0\rangle &= \frac{|B_L\rangle + |B_H\rangle}{2p} \\
|\bar B^0\rangle &= \frac{|B_L\rangle - |B_H\rangle}{2q}
\end{aligned}
\tag{10.30}
\]

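Equations 10.30 can be used to write the time evolution for a state $|\psi\rangle$
that was created as a $B^0$ at time t = 0:

\[
|\psi(t)\rangle = \frac{e^{-\Gamma t/2}\,e^{-iMt}}{2p}
\left( e^{i\Delta M t/2}\,|B_L\rangle + e^{-i\Delta M t/2}\,|B_H\rangle \right)
\tag{10.31}
\]

We have used the (good) approximation that the $B_L$ and $B_H$ have the
same lifetime here. Using eqn 10.29, we can express this as

\[
|\psi(t)\rangle = \frac{e^{-\Gamma t/2}\,e^{-iMt}}{2p}
\left[ e^{i\Delta M t/2}\left(p\,|B^0\rangle + q\,|\bar B^0\rangle\right)
     + e^{-i\Delta M t/2}\left(p\,|B^0\rangle - q\,|\bar B^0\rangle\right) \right]
\tag{10.32}
\]

\[
\phantom{|\psi(t)\rangle} = e^{-\Gamma t/2}\,e^{-iMt}
\left[ \cos(\Delta M t/2)\,|B^0\rangle + i\,\frac{q}{p}\,\sin(\Delta M t/2)\,|\bar B^0\rangle \right]
\tag{10.33}
\]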

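We can now evaluate the probability for a state that was initially a $B^0$ to
be found as a $B^0$ at time t and similarly the probability that it oscillates
into a $\bar B^0$:

\[
\begin{aligned}
|\langle B^0|\psi(t)\rangle|^2 &= e^{-\Gamma t}\cos^2(\Delta M t/2) \\
|\langle \bar B^0|\psi(t)\rangle|^2 &= \left|\frac{q}{p}\right|^2 e^{-\Gamma t}\sin^2(\Delta M t/2)
\end{aligned}
\tag{10.34}
\]

The amount of CP violation in the mixing of B mesons is found to be
small. In the limit of no CP violation, $|p|/|q| = 1$ and³² $B_L$ and $B_H$
would be orthogonal.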
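
The two probabilities can be evaluated directly. The sketch below is a minimal illustration; the lifetime and mass difference are rounded values assumed for the $B^0$ system, not numbers given in this section, and the function name is hypothetical.

```python
import math


def mixing_probabilities(t, gamma, delta_m, q_over_p_abs=1.0):
    """Return (unmixed, mixed) probabilities for a meson created as B0 at t = 0.

    unmixed = exp(-gamma t) cos^2(delta_m t / 2)
    mixed   = |q/p|^2 exp(-gamma t) sin^2(delta_m t / 2)
    """
    decay = math.exp(-gamma * t)
    unmixed = decay * math.cos(delta_m * t / 2.0) ** 2
    mixed = q_over_p_abs ** 2 * decay * math.sin(delta_m * t / 2.0) ** 2
    return unmixed, mixed


TAU_PS = 1.52          # assumption: B0 lifetime in ps
DELTA_M_PER_PS = 0.51  # assumption: B0 mass difference in ps^-1
GAMMA = 1.0 / TAU_PS

for t_ps in (0.0, 1.5, 3.0, 6.0):
    unmixed, mixed = mixing_probabilities(t_ps, GAMMA, DELTA_M_PER_PS)
    print(f"t = {t_ps:3.1f} ps: unmixed = {unmixed:.3f}, mixed = {mixed:.3f}")
```

With $|q/p| = 1$ the two probabilities add up to the pure exponential decay law, so mixing redistributes the flavour content without changing the total decay rate.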


10.7 CP violation (part 2) 


The CKM formalism for CP violation with one complex phase $\delta$ can be
easily made to fit the experimental CP violation effects described up to
now, and indeed the value of $\delta$ is not strongly constrained by these. One
reason is that strong-force (QCD) effects on the theoretical prediction of
$\varepsilon$ are difficult to calculate,³³ so it is not easy to relate it to $\delta$. However,
the CKM formalism gives predictions (as a function of the assumed value
of $\delta$) of CP violation in other systems, and, owing to the values of the
CKM elements, some of the effects in the B-meson systems can be large.
Although the kaons have the advantage that the $K_S$ and $K_L$ have very
different lifetimes, there are several tricks for measuring CP violation
with B mesons that do not have this feature.

As stated earlier, we need to find a situation where (a) all three quark 
generations can be involved and (b) several routes lead to the final state, 
so that when we add the amplitudes together and square to get the 
observable quantity, there is a dependence on the complex phase in the 
CKM matrix. 


10.7.1 CP violation in time-dependent 
asymmetries 


The first technique that exploits this is elegant and gives a large effect,
called the time-dependent asymmetry. It is our first example of type (c)
CP violation from Table 10.2. It has been studied in detail at facilities
called B-factories and involves making a $B^0\bar B^0$ pair from the decay of an
$\Upsilon(4S)$ $b\bar b$ meson, which we will look at shortly. It can also be studied
at the LHCb experiment, which we will also examine later in this chapter.
A final state that is a CP eigenstate is required, one to which both a $B^0$
and a $\bar B^0$ can decay; a good example of this is $B^0 \to J/\psi K_S$. A schematic
of the entire process³⁴ is shown in Fig. 10.12. It turns out that the CP
violation in the mixing of $B^0$ (in the box diagrams in Fig. 10.4) is very
small, but the final state has two routes by which it can be made from
a $B^0$: one in which the $B^0$ oscillates to a $\bar B^0$ before decaying and one
where it does not. The amplitudes for these two parts of the decay will
interfere, and the sensitivity to the angle $\delta$ in the CKM matrix is large.

³² The relationship $\langle B_L|B_H\rangle = |p|^2 - |q|^2$ carries over from the kaons.

³³ It is a complex theoretical computation and there are several large
terms that nearly cancel to give a small overall number, which
consequently has a large uncertainty.

³⁴ Although the $K_S$ is needed to produce a CP final state, we can draw this
in the Feynman diagram as either an $s\bar d$ or a $d\bar s$; although there is CP
violation in the mixing of the kaons, it is small compared with the main
source of CP violation in this technique and can be neglected.

Fig. 10.12 Schematic of the processes involved in the time-dependent
asymmetry measurements. The $\Upsilon(4S)$ produces a pair of B mesons (shown
left). One decays in a way that tags whether there is a $b$ or $\bar b$ quark in
the meson (shown below), the other decays to a CP eigenstate that is
accessible from either the $B^0$ or $\bar B^0$ (shown above). Provided there are
two routes with similarly large amplitudes to get to the same final state,
interference between them can reveal the complex phase in the CKM matrix.

³⁵ In the following equations, $f_{CP}$ is shortened to $f$.

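Considering decays of B mesons to states of definite CP (labelled as
$f_{CP}$), we define the amplitude for $B^0 \to f_{CP}$ as $A_f$ and the amplitude
for a $\bar B^0$ decay to the same state as $\bar A_f$. It is conventional to define the
parameter

\[
\lambda_f = \frac{\bar A_f}{A_f}\,\frac{q}{p}
\tag{10.35}
\]

From eqn 10.33, we can write down the amplitude representing the decay
of a $B^0$ state, with wavefunction $\psi(t)$, to the $f_{CP}$ state:³⁵

\[
\langle f|\psi(t)\rangle = e^{-\Gamma t/2}\,e^{-iMt}
\left[ A_f \cos(\Delta M t/2) + i\,\frac{q}{p}\,\bar A_f \sin(\Delta M t/2) \right]
\]

Using the definition of $\lambda_f$ from eqn 10.35, this can be rewritten as

\[
\langle f|\psi(t)\rangle = A_f\,e^{-\Gamma t/2}\,e^{-iMt}
\left[ \cos(\Delta M t/2) + i\lambda_f \sin(\Delta M t/2) \right]
\tag{10.36}
\]

Similarly, the amplitude representing the decay of a $\bar B^0$ state, with
wavefunction $\bar\psi(t)$, to the same $f_{CP}$ state is

\[
\langle f|\bar\psi(t)\rangle = \bar A_f\,e^{-\Gamma t/2}\,e^{-iMt}
\left[ \cos(\Delta M t/2) + \frac{i}{\lambda_f} \sin(\Delta M t/2) \right]
\tag{10.37}
\]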

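Assuming that $|A_f|^2 = |\bar A_f|^2$ as expected in the Standard Model, and
that $|p| \approx |q|$ because the CP violation in the mixing is small, we can
evaluate the decay rates as (Exercise 10.3)

\[
\begin{aligned}
|\langle f|\psi(t)\rangle|^2 &= |A_f|^2 e^{-\Gamma t}
\left[ 1 - 2\,\mathrm{Im}(\lambda_f)\cos(\Delta M t/2)\sin(\Delta M t/2) \right] \\
|\langle f|\bar\psi(t)\rangle|^2 &= |\bar A_f|^2 e^{-\Gamma t}
\left[ 1 + 2\,\mathrm{Im}(\lambda_f)\cos(\Delta M t/2)\sin(\Delta M t/2) \right]
\end{aligned}
\tag{10.38}
\]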


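We define the CP-violating asymmetry in the usual way,

\[
A_{CP}(t) = \frac{|\langle f|\bar\psi(t)\rangle|^2 - |\langle f|\psi(t)\rangle|^2}
                 {|\langle f|\bar\psi(t)\rangle|^2 + |\langle f|\psi(t)\rangle|^2}
\tag{10.39}
\]

and, substituting from eqns 10.38, we find

\[
A_{CP}(t) = \mathrm{Im}(\lambda_f)\,\sin(\Delta M t)
\tag{10.40}
\]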

The $J/\psi K_S$ state is a CP eigenstate (with eigenvalue $\eta_{CP} = -1$),
so is suitable for a time-dependent asymmetry analysis. We can use the
CKM matrix to predict the phase factors with or without mixing via
the box diagram (see Fig. 10.12). In the box diagrams, the important
contribution comes from exchange of virtual top quarks owing to the
large value of $V_{tb}$. Note that the asymmetry depends on the phase of $\lambda_f$.
The phase factors in the box diagram come from the CKM elements and
contribute $V_{td}^2 V_{tb}^2$. Since $V_{tb}$ is real, the non-zero phase arises from the
factor of $V_{td}^2$. Hence the amplitude of the CP-violating term is given by
$A_{CP} = \mathrm{Im}(V_{td}^2)$. Now if we look at the unitarity triangle (see Fig. 10.10),
we can see that the phase of $V_{td}$ is $\beta$. Hence $A_{CP} = \sin 2\beta$.³⁶
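
The interference effect can be reproduced numerically from the two decay amplitudes. The sketch below assumes $|\lambda_f| = 1$, takes the asymmetry as the $\bar B^0$ rate minus the $B^0$ rate divided by their sum (an ordering assumed here), and uses $\mathrm{Im}(\lambda_f) = 0.671$ together with a rounded $B^0$ lifetime and mass difference as illustrative inputs. The function names are hypothetical.

```python
import math


def decay_rates(t, lam, delta_m, gamma):
    """Rates to a CP eigenstate for states created as B0 and as B0-bar.

    Built from the amplitudes cos(x) + i*lam*sin(x) and cos(x) + (i/lam)*sin(x),
    with x = delta_m * t / 2 and equal decay-amplitude magnitudes.
    """
    c = math.cos(delta_m * t / 2.0)
    s = math.sin(delta_m * t / 2.0)
    envelope = math.exp(-gamma * t)
    rate_b = envelope * abs(c + 1j * lam * s) ** 2
    rate_bbar = envelope * abs(c + (1j / lam) * s) ** 2
    return rate_b, rate_bbar


def cp_asymmetry(t, lam, delta_m, gamma):
    """(rate_bbar - rate_b) / (rate_bbar + rate_b); the ordering is an assumption."""
    rate_b, rate_bbar = decay_rates(t, lam, delta_m, gamma)
    return (rate_bbar - rate_b) / (rate_bbar + rate_b)


IM_LAMBDA = 0.671       # illustrative: the measured sin(2 beta) quoted in this chapter
LAMBDA_F = complex(-math.sqrt(1.0 - IM_LAMBDA ** 2), IM_LAMBDA)  # |lambda_f| = 1
DELTA_M_PER_PS = 0.51   # assumption: B0 mass difference in ps^-1
GAMMA_PER_PS = 1.0 / 1.52  # assumption: inverse B0 lifetime in ps^-1

for t_ps in (0.5, 1.5, 3.0, 4.5):
    a = cp_asymmetry(t_ps, LAMBDA_F, DELTA_M_PER_PS, GAMMA_PER_PS)
    expected = IM_LAMBDA * math.sin(DELTA_M_PER_PS * t_ps)
    print(f"t = {t_ps:3.1f} ps: asymmetry = {a:+.4f}, Im(lambda) sin(dM t) = {expected:+.4f}")
```

The asymmetry computed from the amplitudes follows $\mathrm{Im}(\lambda_f)\sin(\Delta M t)$ at every time, and the exponential decay factor cancels in the ratio.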


10.7.2 B-factories 


Since the B-meson system is expected to be a prolific source of physics,
many recent studies of B mesons, including the time-dependent
asymmetries, have been made at two ‘B-factories’. These are high-luminosity
$e^+e^-$ colliders, KEK-B in Japan (with the Belle detector) and PEP-II
(and the BaBar experiment) in the USA, operating at a centre-of-mass
energy of 10.58 GeV, the mass of the $\Upsilon(4S)$ resonance. The $\Upsilon(4S)$ is a
C = −1 state that decays almost entirely to $B_d\bar B_d$ pairs with a branching
ratio of 49% to $B^0\bar B^0$.³⁷ Studies at $e^+e^-$ B-factories continue with
the new Belle II experiment at the upgraded SuperKEKB accelerator.

Both machines are ‘asymmetric’ in that the laboratory energies of
the electron and positron beams are different; specifically, at PEP-II,
the low-energy ring operates at 3.1 GeV and the high-energy ring at
9.0 GeV. The asymmetry between the energies of the two rings means
that the centre-of-mass system (CMS) moves at a velocity of $\beta \approx 0.5$
in the laboratory and the resulting Lorentz boost allows resolution of
the B decay vertices, which would otherwise not be resolved if mesons
were produced at rest in the CMS. This is illustrated in Fig. 10.13.
Nevertheless, the detectors must have extremely good vertex resolution,
since the average distance from production to decay, $\beta\gamma c\tau_B$, is only about
260 μm even with the Lorentz boost.
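
The size of the boost follows from the two beam energies alone. The sketch below treats the beams as massless and head-on, neglects the small momentum of the B mesons in the CMS, and assumes $c\tau_B \approx 455$ μm for the $B^0$ (a value not given in the text). The function name is hypothetical.

```python
import math


def cms_boost(e_high_gev, e_low_gev):
    """Return (beta*gamma, beta, sqrt_s) for head-on beams with negligible mass."""
    e_total = e_high_gev + e_low_gev
    p_total = e_high_gev - e_low_gev          # net momentum along the beam axis
    sqrt_s = math.sqrt(e_total ** 2 - p_total ** 2)
    return p_total / sqrt_s, p_total / e_total, sqrt_s


C_TAU_B0_UM = 455.0  # assumption: c * tau for the B0, in micrometres

beta_gamma, beta, sqrt_s = cms_boost(9.0, 3.1)   # PEP-II ring energies from the text
mean_flight_um = beta_gamma * C_TAU_B0_UM

print(f"sqrt(s)          = {sqrt_s:.2f} GeV")
print(f"beta             = {beta:.2f}")
print(f"beta*gamma       = {beta_gamma:.2f}")
print(f"mean flight path = {mean_flight_um:.0f} micrometres")
```

Under these assumptions the centre-of-mass energy comes out close to the $\Upsilon(4S)$ mass and the mean flight distance is roughly 250 μm, in line with the figure quoted above.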

Because $B^0\bar B^0$ pairs from the decay of $\Upsilon(4S)$ are produced essentially
at rest in the CMS, and the $\Upsilon(4S)$ is a state of definite (odd) C, the
initial state is

\[
\Upsilon(4S)_{C=-1} \to \frac{1}{\sqrt{2}}
\left[ |B^0(p)\,\bar B^0(-p)\rangle - |B^0(-p)\,\bar B^0(p)\rangle \right]
\]

where p is the CMS momentum of one of the Bs. The oscillations that
occur after production are quantum-correlated: since the state must be
odd under the interchange of the two mesons, they cannot both be $B^0$ or
$\bar B^0$ simultaneously; they must oscillate together. The coherence is lost
once one of the mesons has decayed and oscillations can be observed: the
decay of one meson starts the clock for the time-dependent oscillations
of the other. Experimentally, the time $\Delta t = t_1 - t_2$ is determined from
the distance between the decay vertices, as sketched in Fig. 10.13.

³⁶ A similar analysis for the case of $J/\psi K_L$ gives a similar result with
the sign changed.

³⁷ $B_s$ and $\bar B_s$ are too massive to be produced from $\Upsilon$ decays.

Fig. 10.13 The Lorentz boost due to the finite velocity of the CMS in an
asymmetric B-factory allows the resolution of the B-meson decay
vertices. The distance between the two decay vertices allows the time
between the two $B^0$ decays to be determined.

Fig. 10.14 Data from the Belle experiment: (a, b) numbers of events with a
$B^0$ and with a $\bar B^0$ tag as a function of $\Delta t$ and (c, d) the time-dependent
asymmetry. The data are shown separately for the CP = −1 (e.g. $J/\psi K_S$)
(a, c) and CP = +1 ($J/\psi K_L$) (b, d). From [48].

The scheme of the processes that are selected in the detector to
measure the time-dependent asymmetry is shown in Fig. 10.12, which
we have partly discussed already. We consider events that decay as
$\Upsilon(4S) \to B^0\bar B^0 \to f_{CP}\,f_{tag}$, where $f_{CP}$ is a CP eigenstate (shown in
the top part of Fig. 10.12) and $f_{tag}$ is the decay of a B meson that is
tagged as either $B^0$ or $\bar B^0$ through a semileptonic decay as shown in the
bottom part of Fig. 10.12. Let $n(B^0)$ ($n(\bar B^0)$) be the number of tagged
$B^0$ ($\bar B^0$) events observed as a function of $\Delta t$. We can redefine the
asymmetry $A_{CP}$ from eqn 10.39 in terms of event numbers as

\[
A_{CP} = \frac{n(\bar B^0) - n(B^0)}{n(\bar B^0) + n(B^0)}
\tag{10.41}
\]

Figure 10.14 shows the numbers of tagged events (a, b) and $A_{CP}$ (c, d)
from the Belle experiment [48] as functions of $\Delta t$ for the processes
$B \to J/\psi K_S$ (a, c) and $B \to J/\psi K_L$ (b, d). The dataset from Belle has
700 million $B^0\bar B^0$ events. Similar results are also obtained at the other
B-factory by the BaBar experiment at SLAC with a sample of 460 million
$B^0\bar B^0$ events. It can be deduced from these and other measurements that
$\sin 2\beta = 0.671 \pm 0.023$.

[Fig. 10.14: four panels (a)–(d); horizontal axis $\Delta t$ (ps), from −6 to 6.]

The B-factories also studied time-dependent asymmetries with other
final states. The $B \to J/\psi K_S$ described here is particularly clean, with
diagrams that are easy to calculate theoretically being dominant. This
mode is an example of a diagram where the quark-level decay is $b \to c\bar c s$,
and gives access to the angle $\beta$. Measurements of the other angles of the
unitarity triangle are more involved but can be done. For example, the
angle $\alpha$ can be determined from the time-dependent analysis of
$B^0 \to \pi^+\pi^-$ decays, but this is more challenging because of the small
branching ratio for this decay mode and backgrounds from other decay
modes. In addition, the theoretical analysis of this mode is more
complicated.

There are about eight combinations overall of quark decays that can
in principle be studied with different final states. Highly significant (over
5σ) CP-violating effects have been observed in the following final states:
$J/\psi K$, $\eta' K$, $\phi K$, $f_0 K$, $K^+K^-K_S$, $\pi^+\pi^-$, $\psi\pi$, $D^+D^-$, and $D^{*+}D^{*-}$. In
some of these cases, there are more than two particles in the measured
final state, which in general do not have definite CP, and a certain
region of the Dalitz plot (see Chapter 2, page 36, for an example) must
be selected to isolate the decays in the CP eigenstate.


10.7.3 LHCb detector 


The B-factories using the $\Upsilon(4S)$ can produce $B_d$ mesons but they are
below threshold for producing $B_s$ mesons. Hadron colliders offer a prolific
source of B mesons, including the $B_s$ mesons, and this will be discussed
in this section. The angle $\gamma$ can be determined most cleanly from
measurements of direct CP violation in $B^+ \to D^0 K^+$ decays, or from effects
involving the $B_s$ mesons.

The LHCb experiment [99] is a dedicated B-physics facility at the
LHC, established following the success in studying B physics at the
Tevatron experiments. Because of their relatively small mass compared
with the CMS energy, b hadrons are produced predominantly along the
beam directions, in what is called the ‘forward direction’. One side of an
LHC interaction region has been instrumented with detectors specifically
optimized for (a) good spatial resolution (which gives good decay time
resolution) to distinguish and measure B decays, (b) excellent particle
identification (e.g. in distinguishing $B \to \pi^+\pi^-$ decays from background
$B \to K^+\pi^-$), and (c) a very efficient trigger for selecting B events.

The spatial resolution is achieved with a silicon vertex detector (see
Section 4.6.2).³⁸ Because of the geometry of measuring particles in the
forward direction, the LHCb silicon vertex detector fits inside the
vacuum beam pipe. The detectors are split in two parts so that they can be
moved closer to (further away from) the beam during normal operation
(filling the accelerator with particles) with motors to avoid exposure to
too much radiation. Thin foils surround the vertex detectors to provide
shielding, since otherwise the pickup from the moving charges in the
beam nearby would swamp the signals of the particles being detected.
³⁸ The large boost at LHC helps improve the proper-time resolution.

Fig. 10.15 The LHCb RICH detector. Cerenkov photons created by charged
particles with speed $\beta > 1/n$ are focused onto the surface of the photon
detectors by a series of mirrors. The radiator C₄F₁₀ is selected because
of its high refractive index among the inert gases. From [99].

³⁹ In practice, we do not want to measure the mass but simply to provide
separation between particles of different masses.

The particle identification is mainly needed for K/π separation. This is
achieved over a wide range of momenta using two ring imaging Cerenkov
(RICH) detectors, one of which is shown in Fig. 10.15. The Cerenkov
effect is described in Section 4.3.4; when a particle travelling at speed
$\beta$ is sufficiently fast that $\beta > 1/n$, light is emitted at an angle $\theta_C$ given
by $\cos\theta_C = 1/(n\beta)$, where n is the refractive index of the medium. In a
RICH, the locations of the Cerenkov photons are measured and used to
fit a cone around the particle trajectory to determine $\theta_C$. In combination
with the momentum measurement from the magnetic spectrometer, this
can in principle determine the particle mass.³⁹
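
To see how the Cerenkov angle separates kaons from pions at a given momentum, the relation $\cos\theta_C = 1/(n\beta)$ can be evaluated for both masses. The sketch below assumes a refractive index of about 1.0014 for the gas radiator, an illustrative value not given in the text, and uses rounded particle masses. The function names are hypothetical.

```python
import math

M_PION_GEV = 0.13957   # charged pion mass (rounded)
M_KAON_GEV = 0.49368   # charged kaon mass (rounded)
N_RADIATOR = 1.0014    # assumption: refractive index of the gas radiator


def beta(p_gev, m_gev):
    """Speed in units of c for momentum p and mass m."""
    return p_gev / math.sqrt(p_gev ** 2 + m_gev ** 2)


def cherenkov_angle(p_gev, m_gev, n):
    """Cerenkov angle in radians, or None if the particle is below threshold."""
    n_beta = n * beta(p_gev, m_gev)
    if n_beta <= 1.0:
        return None
    return math.acos(1.0 / n_beta)


def threshold_momentum(m_gev, n):
    """Momentum at which beta = 1/n, i.e. p = m / sqrt(n^2 - 1)."""
    return m_gev / math.sqrt(n ** 2 - 1.0)


print(f"pion threshold: {threshold_momentum(M_PION_GEV, N_RADIATOR):.1f} GeV")
print(f"kaon threshold: {threshold_momentum(M_KAON_GEV, N_RADIATOR):.1f} GeV")

for p in (5.0, 20.0, 50.0):
    theta_pi = cherenkov_angle(p, M_PION_GEV, N_RADIATOR)
    theta_k = cherenkov_angle(p, M_KAON_GEV, N_RADIATOR)
    k_text = "below threshold" if theta_k is None else f"{1e3 * theta_k:.1f} mrad"
    print(f"p = {p:4.1f} GeV: pion {1e3 * theta_pi:.1f} mrad, kaon {k_text}")
```

Between the two thresholds only the pion radiates, and above the kaon threshold the two angles differ by a few milliradians, a difference that shrinks as the momentum increases. This is why a single radiator covers only a limited momentum range.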

The LHCb trigger consists of hardware and software levels following 
the multilevel scheme described in Section 4.10. The hardware trigger 
uses simple algorithms based on energy deposition in the calorimeters 
and hits in the muon chambers to reduce the rate to 1 MHz. All the 
data are read out at this frequency and a large computer farm performs 
more sophisticated calculations to reduce the rate to 4kHz, and the data 
that pass this higher-level trigger are retained on permanent storage for 
subsequent analysis. 


10.8 LHCb measurements 


LHCb has made time-dependent asymmetry measurements that are
as precise as the B-factory measurements (Section 10.7.2). The decay
$B^0 \to J/\psi K_S$ can be reconstructed from the decays $J/\psi \to \mu^+\mu^-$
and $K_S \to \pi^+\pi^-$. These particular final states are the simplest to
detect in hadronic collisions.⁴⁰ The decay time of the $B^0$ is reconstructed
from the separation between the primary and secondary vertex
(measured by the silicon detectors) and the momentum of the $B^0$. The
flavour of the ‘non-signal $B$’ is determined by specific decays, including
semileptonic decays to electrons or muons. The time-dependent
asymmetry (eqn 10.41) is shown [102] in Fig. 10.16. This is the type (c) CP
violation from Table 10.2.

In contrast to the kaon system, CP violation due to mixing alone in
the B system does not occur at a high level. This is the type (a) CP
violation from Table 10.2 and is characterized by $|q/p| \neq 1$. This can be
measured by a time-independent semileptonic asymmetry, and the
measurements imply that $|q/p| = 1.0002 \pm 0.0028$.

Direct CP violation (type (b) from Table 10.2) has been observed as
a difference in the decay rates of neutral B mesons to CP-conjugate
final states. For example, for the CP-conjugate decays $B^0 \to K^+\pi^-$
and $\bar B^0 \to K^-\pi^+$, the asymmetry parameter is measured to be

\[
A_{K^\mp\pi^\pm} = \frac{\Gamma_{K^-\pi^+} - \Gamma_{K^+\pi^-}}{\Gamma_{K^-\pi^+} + \Gamma_{K^+\pi^-}}
= -0.080 \pm 0.008
\tag{10.42}
\]

A similar asymmetry has been measured in the decay of charged B
mesons to the $K\rho^0$ final state: $A_{K\rho^0} = 0.37 \pm 0.1$.⁴¹ This shows that
CP-violating effects can be measured in charged mesons and are not limited
to the four main neutral meson systems.

The first evidence for CP violation in $B_s^0$ decays [101] uses the mode
$B_s^0 \to K^-\pi^+$ and is an example of direct CP violation. This decay
mode has a very small branching ratio and therefore good background
rejection is essential. This is achieved using the excellent K/π separation
and mass resolution. The measurement is $A(B_s \to K\pi) = 0.27 \pm 0.07$,
defined similarly to eqn 10.42.
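
An asymmetry of this kind is, at its simplest, a ratio of event counts. The sketch below shows how a raw asymmetry and its binomial statistical uncertainty follow from two counts; the counts are hypothetical, and backgrounds, detection asymmetries, and production asymmetries, which a real analysis must correct for, are ignored. It also prints the naive significance of the two results quoted above.

```python
import math


def raw_asymmetry(n_conj, n_ref):
    """Return (A, sigma_A) with A = (n_conj - n_ref) / (n_conj + n_ref).

    sigma_A = sqrt((1 - A^2) / N) is the binomial statistical uncertainty
    for N = n_conj + n_ref total events.
    """
    total = n_conj + n_ref
    asym = (n_conj - n_ref) / total
    sigma = math.sqrt((1.0 - asym ** 2) / total)
    return asym, sigma


# Hypothetical counts chosen only to illustrate the arithmetic.
asym, sigma = raw_asymmetry(n_conj=18400, n_ref=21600)
print(f"A = {asym:+.3f} +/- {sigma:.3f}")

# Naive significance (value / uncertainty) of the measurements quoted in the text.
for label, value, error in (("B0 -> K pi", -0.080, 0.008), ("Bs -> K pi", 0.27, 0.07)):
    print(f"{label}: {abs(value) / error:.1f} standard deviations from zero")
```

Because the uncertainty scales as $1/\sqrt{N}$, an asymmetry of a few per cent needs tens of thousands of selected decays before it can be distinguished clearly from zero.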
Fig. 10.16 Time-dependent CP-violating asymmetries observed by the
LHCb experiment in the decay mode $B^0 \to J/\psi K_S^0$. From [102].

⁴⁰ A disadvantage is that the branching ratio of $B^0$ to this final state is
only ∼ 10⁻³.

⁴¹ Similar asymmetry parameters for over two hundred final states of
charged and neutral B mesons have been measured and only a few of these,
including the two mentioned, are statistically different from zero.
Chapter summary


• Quantum mechanics allows for the possibility of quark–antiquark
oscillations.


• These oscillations have been clearly observed in the $K^0$, $D^0$, $B^0$, and $B_s^0$
mesons. The oscillations can be accounted for by loop Feynman diagrams
called ‘box diagrams’.


• CP violation has been observed in $K^0$, $B^0$, and $B_s^0$ mesons.


• CP violation can be classified as arising from direct decays, mixing, and
interference between decays with and without mixing.


• CP violation in the quark sector can be accommodated in the Standard
Model with the introduction of a complex phase in the CKM matrix.


• Precision CP violation measurements are a very sensitive probe of new
physics at very high mass scales.


Further reading 


• Sozzi, M. S. (2008). Discrete Symmetries and CP Violation:
From Experiment to Theory. Oxford University Press.

• Gershon, T. and Nir, Y. (2012). CP violation in the
quark sector. Phys. Rev. D, 86, 010001.

• LHCb Collaboration and Bharucha, A. et al. (2013).
Implications of LHCb measurements and future prospects.
EPJ C, 73, 2373.


Exercises

(10.1) Starting from the definitions of p and q in eqns 10.23, show that
Ay = $|p|^2 - |q|^2$.

(10.2) In the NA48 experiment shown in Fig. 10.9, the four photons from
$\pi^0\pi^0$ decays are reconstructed in the liquid-krypton calorimeter, yielding
12 quantities $E_i$, $x_i$, $y_i$ for $i = 1, \ldots, 4$, where $E_i$ is the energy of the
ith photon and $x_i$ and $y_i$ are the positions in each of the two directions
transverse to the beam. If the mass of the kaon, $m_K$, is inserted as input,
show that the distance in front of the calorimeter z along the beam direction
may be reconstructed as

$$z = \frac{1}{m_K}\sqrt{\sum_{i<j} E_i E_j \left[(x_i - x_j)^2 + (y_i - y_j)^2\right]}$$

Hint: First consider a $\pi^0 \to \gamma\gamma$ decay with the y axis perpendicular to
the plane of the decay, and discard non-leading terms in $x^2/z^2$.

(10.3) Starting from eqns 10.36 and 10.37, derive the CP-violating asymmetry
(eqn 10.40).

(10.4) Figure 10.12 shows the processes that are important in the study of
time-dependent asymmetries for $B \to J/\psi K_S$. This technique can also be
used (with more subtleties) for other modes. Draw a similar set of diagrams
to show how the process $B \to \pi^+\pi^-$ proceeds. Penguin diagrams (similar to
Fig. 10.11) are non-negligible in this mode; add an example penguin to your
diagram. Another mode that exhibits time-dependent asymmetries is
$B \to \phi K_S$; draw the relevant set of diagrams for this mode.

(10.5) For the decay $B_s \to \pi^+K^-$, draw a tree diagram and a penguin diagram
for this decay. Show that the tree diagram is proportional to $V_{ub}^*V_{ud}$,
and deduce the CKM matrix elements upon which the penguin diagram depends.
You should find that the flavour inside the dominant penguin is charm.
Repeat for the $B_d$ decay $B^0 \to K^+\pi^-$ and notice that the arrangement of
the lines in these two decays is very similar. It turns out that because one
can show that $\mathrm{Im}(V_{ub}^*V_{ud}V_{cb}V_{cd}^*) = -\mathrm{Im}(V_{ub}^*V_{us}V_{cb}V_{cs}^*)$, the
amount of direct CP violation in each of these two decays is the same,
which is a result that has been experimentally tested [101].
Neutrino oscillations


11.1 Introduction 


In the minimal Standard Model (SM), the neutrinos are assumed to 
be massless. However, as the neutrino masses are not protected by any 
gauge symmetry (unlike the photon), it is easy to extend the SM to 
accommodate neutrino masses. From an experimental perspective, it 
is much easier to detect small mass differences between different neu- 
trino mass eigenstates than to measure their absolute masses. This is 
because small mass differences combined with mixing will cause one fla- 
vour of neutrino to oscillate into other flavours, which is an effect that 
can be measured. This chapter starts with a very brief review of the 
determination of upper limits on the neutrino masses and then gives 
an explanation of the theory of neutrino flavour oscillations. We begin 
with the simple case of two-neutrino oscillations, since this brings out 
the essential features with minimal complications. We then review the 
experimental evidence for neutrino oscillations. Three-generation mix- 
ing is very interesting, since it allows for the possibility of CP violation 
in the neutrino sector, and the mathematical treatment will be given 
before showing the recent experimental evidence for a non-zero value 
of the mixing angle between first- and third-generation neutrinos. It is 
possible that CP violation in the neutrino sector can explain the observed
matter-antimatter asymmetry in the Universe, and this will be
discussed in Section 11.5.4.


11.1.1 Neutrino masses 


The oscillation experiments measure mass differences between the dif- 
ferent types of neutrinos, but they are not sensitive to the absolute mass 
of any neutrino (see Section 11.3). In principle, neutrino masses can be 
measured from the endpoint(s) of decay spectra. For example, the β decay
of tritium, $^3\mathrm{H} \to {}^3\mathrm{He} + e^- + \bar{\nu}_e$, has an endpoint at the Q value of
this reaction of 18.6 keV, assuming $m_{\nu_e} = 0$.¹ The effect of a non-zero
value of $m_{\nu_e}$ would be to create a distortion in the spectrum near the
endpoint. Similar studies have been performed using μ and τ decays. So
far, only upper limits have been obtained [115]. Limits on the sum of 
the neutrino masses can also be obtained from cosmological arguments, 
since finite-mass neutrinos could contribute to the matter density of the 
Universe and affect structure formation in the early Universe. The upper
limits depend on which theoretical assumptions are made, but the
current upper limit for the sum of the neutrino masses (for all gener- 
ations) is around 1eV [115]. A more direct but less precise limit on the 
mass of the electron neutrino can be derived from the observations of 
neutrinos from the supernova SN1987A (see Exercise 11.4). 

Neutrinos could be of Dirac or Majorana type (see Chapter 6). If 
they are of Majorana type, this would imply that a neutrino could be 
its own antiparticle, which would allow neutrinoless double β decay,
$^{A}Y \to {}^{A}Z + 2e^-$. The Standard Model background to its detection
would be normal double β decay, $^{A}Y \to {}^{A}Z + 2e^- + 2\bar{\nu}_e$. If we consider
the combined energy of the two electrons, the Standard Model background
would produce a continuous spectrum as opposed to the peak
at the endpoint expected for the neutrinoless double β decay. Several
experiments are now looking for this signal; for example, the SNO+ experiment
is using the isotope $^{130}$Te, which is a double β emitter. The
experiment uses liquid scintillator in the SNO detector (previously filled 
with heavy water for solar neutrino studies). The scintillation process 
gives a larger number of photons than from the Cerenkov process (which 
generates the photons in water). Hence the energy resolution for low- 
energy electrons is significantly better using liquid scintillator compared 
with heavy water. 


11.2 Neutrino states 


We know from studying the weak interaction that particle mass states 
need not be the same as the weak-interaction states. The two types 
of state are connected by the Cabibbo rotation matrix. A similar phenomenon
occurs for neutrino states. If neutrinos have a small mass,
the flavour (i.e. weak-interaction) eigenstates are related to the mass 
eigenstates via a CKM-like matrix: 


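$$\begin{pmatrix} \nu_e \\ \nu_\mu \\ \nu_\tau \end{pmatrix} = \begin{pmatrix} U_{e1} & U_{e2} & U_{e3} \\ U_{\mu 1} & U_{\mu 2} & U_{\mu 3} \\ U_{\tau 1} & U_{\tau 2} & U_{\tau 3} \end{pmatrix} \begin{pmatrix} \nu_1 \\ \nu_2 \\ \nu_3 \end{pmatrix} \qquad (11.1)$$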


The flavour states $\nu_e$, $\nu_\mu$, and $\nu_\tau$ propagate in space-time as linear
combinations of the mass eigenstates $\nu_1$, $\nu_2$, and $\nu_3$. The mixing matrix U is
known as the PMNS (Pontecorvo-Maki-Nakagawa-Sakata) matrix and,
as will be discussed in Section 11.5.2, can be generalized to an arbitrary
number N of flavours. For the general case of $N \geq 3$, the elements $U_{\alpha i}$
may be complex. The mixing causes transitions between different flavours
such as $\nu_\mu$ and $\nu_\tau$. As will be shown below, the flavour transition
probabilities $P(\nu_\alpha \to \nu_\beta)$ depend on the differences between the squared
masses of the mass eigenstates and are oscillatory functions of L/E, where L is
the distance a neutrino of energy E has travelled from the source to a
detector.
2 This treatment follows the simplified
books. There are many subtleties in 
a more correct analysis, but the final 
results are unchanged (see Akhmedov 
and Smirnov in Further Reading). We 
assume that we have a superposition 
of neutrinos with the same momen- 
tum but different energies. In Exer- 
cise 11.1, we investigate the effect 
of changing this assumption. A more 
general treatment involves the consid- 
eration of wavepackets with a finite 
spread in momentum. A possible ob- 
jection to our treatment is that it ap- 
pears to violate conservation of energy 
since we have neutrino states with (very 
slightly) different energies. This issue is 
investigated in Exercise 11.2. 


11.3 Two-flavour oscillations 


Most of the original evidence for neutrino oscillations came from the 
study of muon neutrinos originating at the top of the atmosphere and 
electron neutrinos from the Sun. Since the initial neutrino flavours are 
different and the respective ranges of L/E are very different, the results
were usually analysed assuming only two neutrino flavours. It turns out, 
as will be discussed in Section 11.5.2, that this gives a good description of 
the main features of neutrino oscillation phenomena. Since the expres- 
sions for the transition (‘oscillation’) probabilities for three (or more) 
neutrino flavours are somewhat complicated, the physics of two-flavour 
oscillations will be discussed first. This has the additional advantage that 
the underlying physics is more transparent. A discussion of three-flavour 
oscillations is given in Section 11.5.2. 

The derivation of two-flavour neutrino oscillation probabilities is 
similar to the derivation of the $K^0$-$\bar{K}^0$ oscillation probabilities of Section
10.3, with the important difference that it proceeds in the laboratory
frame, rather than the centre-of-mass. 

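With only two flavours, neutrino mixing can be described by one
(‘mixing’) angle. The two flavour eigenstates $\nu_\alpha$ and $\nu_\beta$ (e.g. $\alpha, \beta = e, \mu$)
are linear combinations of mass eigenstates $\nu_1$ and $\nu_2$:

$$\begin{aligned} |\nu_\alpha\rangle &= \cos\theta\,|\nu_1\rangle + \sin\theta\,|\nu_2\rangle \\ |\nu_\beta\rangle &= -\sin\theta\,|\nu_1\rangle + \cos\theta\,|\nu_2\rangle \end{aligned} \qquad (11.2)$$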

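Consider a neutrino of flavour α with momentum p created in a weak
interaction at t = 0. The initial state is $|\psi(0)\rangle = |\nu_\alpha\rangle$, which can be
expressed in terms of the mass eigenstates² as

$$|\psi(0)\rangle = \cos\theta\,|\nu_1\rangle + \sin\theta\,|\nu_2\rangle$$

At some time t later, the $\nu_1$ and $\nu_2$ wavefunctions have evolved:

$$\begin{aligned} |\psi(t)\rangle &= \cos\theta\,|\nu_1\rangle\,e^{-iE_1 t} + \sin\theta\,|\nu_2\rangle\,e^{-iE_2 t} \\ &= \left(\cos\theta\,|\nu_1\rangle + \sin\theta\,|\nu_2\rangle\,e^{-i(E_2 - E_1)t}\right) e^{-iE_1 t} \end{aligned} \qquad (11.3)$$

where $E_1$ and $E_2$ are the energies of the $\nu_1$ and $\nu_2$. Now

$$E_i = \sqrt{p^2 + m_i^2} \approx p\left(1 + \frac{m_i^2}{2p^2}\right)$$

where, because $m_1$ and $m_2$ are undoubtedly very small, the approximation
is good for all practical values of p and $p \approx E$. Ignoring the common
phase factor $e^{-iE_1 t}$ (which will later be cancelled by its conjugate) and
rearranging eqn 11.3,

$$\begin{aligned} |\psi(t)\rangle &= \cos\theta\,|\nu_1\rangle + \sin\theta\,|\nu_2\rangle\,e^{-i(m_2^2 - m_1^2)t/2E} \\ &= \cos\theta\,|\nu_1\rangle + \sin\theta\,|\nu_2\rangle\,e^{-i\Delta m^2 t/2E} \\ &= \cos\theta\,|\nu_1\rangle + \sin\theta\,|\nu_2\rangle\,e^{-i\phi} \end{aligned} \qquad (11.4)$$

The phase difference $\phi = \Delta m^2 t/2E$ between the $\nu_1$ and $\nu_2$ components
of $|\psi(t)\rangle$ depends on $\Delta m^2 = m_2^2 - m_1^2$, the difference between the squared
masses of $\nu_1$ and $\nu_2$, and t/E.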

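The amplitude of $|\psi\rangle$ at t is given by eqn 11.4 in terms of $|\nu_1\rangle$ and $|\nu_2\rangle$.
It can be re-expressed in terms of $|\nu_\alpha\rangle$ and $|\nu_\beta\rangle$ by inverting eqn 11.2:

$$\begin{aligned} |\nu_1\rangle &= \cos\theta\,|\nu_\alpha\rangle - \sin\theta\,|\nu_\beta\rangle \\ |\nu_2\rangle &= \sin\theta\,|\nu_\alpha\rangle + \cos\theta\,|\nu_\beta\rangle \end{aligned}$$

and hence

$$|\psi(t)\rangle = \left(\cos^2\theta + \sin^2\theta\,e^{-i\phi}\right)|\nu_\alpha\rangle + \cos\theta\sin\theta\left(e^{-i\phi} - 1\right)|\nu_\beta\rangle$$

The probability of observing a $\nu_\beta$ at a time t is therefore

$$\begin{aligned} P(\alpha \to \beta) &= |\langle\nu_\beta|\psi(t)\rangle|^2 && (11.5) \\ &= \cos^2\theta\,\sin^2\theta\,(e^{-i\phi} - 1)(e^{i\phi} - 1) && (11.6) \\ &= \cos^2\theta\,\sin^2\theta\,(2 - 2\cos\phi) && (11.7) \\ &= 4\cos^2\theta\,\sin^2\theta\,\sin^2\!\left(\tfrac{1}{2}\phi\right) && (11.8) \\ &= \sin^2(2\theta)\,\sin^2\!\left(\frac{\Delta m^2 t}{4E}\right) && (11.9) \end{aligned}$$

Likewise, the probability that the $\nu_\alpha$ is still observed as a $\nu_\alpha$ at t is

$$P(\alpha \to \alpha) = |\langle\nu_\alpha|\psi(t)\rangle|^2 = 1 - \sin^2(2\theta)\,\sin^2\!\left(\frac{\Delta m^2 t}{4E}\right)$$

Since neutrinos travel at speed c,³ the probability that a $\nu_\alpha$ of energy E
is observed as a $\nu_\beta$ at a distance L from a source is therefore

$$P(\alpha \to \beta) = \sin^2(2\theta)\,\sin^2\!\left(\frac{\Delta m^2 L}{4E}\right) \qquad (11.10)$$

and the probability that the $\nu_\alpha$ is observed as a $\nu_\alpha$ at L is

$$P(\alpha \to \alpha) = 1 - \sin^2(2\theta)\,\sin^2\!\left(\frac{\Delta m^2 L}{4E}\right) \qquad (11.11)$$

Note that $P(\alpha \to \alpha) + P(\alpha \to \beta) = 1$, as required by unitarity.⁴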
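
The algebra leading from eqn 11.2 to eqn 11.9 can be checked numerically. The short Python sketch below (the function names are illustrative, not from the text) evolves the two mass-eigenstate components with a relative phase φ = Δm²t/2E, projects the result onto the β flavour state, and compares the squared amplitude with the closed form sin²(2θ) sin²(φ/2).

```python
import cmath
import math


def appearance_probability_from_amplitudes(theta, phi):
    """P(alpha -> beta) from explicit evolution of the mass eigenstates.

    theta : mixing angle in radians
    phi   : relative phase Delta m^2 t / 2E picked up by nu_2, in radians
    """
    # Components of |psi(t)> along |nu_1> and |nu_2>; the common phase is dropped.
    c1 = math.cos(theta)
    c2 = math.sin(theta) * cmath.exp(-1j * phi)
    # Project onto |nu_beta> = -sin(theta)|nu_1> + cos(theta)|nu_2>.
    amplitude = -math.sin(theta) * c1 + math.cos(theta) * c2
    return abs(amplitude) ** 2


def appearance_probability_closed_form(theta, phi):
    """Closed form sin^2(2 theta) sin^2(phi / 2)."""
    return math.sin(2.0 * theta) ** 2 * math.sin(phi / 2.0) ** 2


for theta, phi in [(0.3, 1.0), (math.pi / 4, math.pi), (0.6, 2.5)]:
    p_amp = appearance_probability_from_amplitudes(theta, phi)
    p_formula = appearance_probability_closed_form(theta, phi)
    print(f"theta={theta:.3f}  phi={phi:.3f}  {p_amp:.6f}  {p_formula:.6f}")
```

The two columns agree to rounding error for any θ and φ, and maximal mixing (θ = π/4) with φ = π gives complete conversion.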
3 We assume that neutrinos travel at
speed c. If neutrinos have non-zero 
mass differences, at least one flavour of 
neutrino must travel at a lower speed 
than c and the speed of the two neu- 
trinos must be slightly different. How- 
ever, for practical purposes, the error 
introduced by this approximation is 
negligible. 


4 It is interesting to note that if the
baseline satisfies $\Delta m^2 L/E \gg 1$, then
the phase will oscillate very rapidly and
$\langle \sin^2(\Delta m^2 L/E) \rangle \to \tfrac{1}{2}$.
5 This is 15 orders of magnitude smaller
than the pp total cross section. 


Equations 11.10 and 11.11 show that if at least one pair of neutrinos
have different masses and there is some mixing, i.e. $\theta \neq 0$, then
transitions between neutrinos of different flavours can occur, violating
conservation of lepton flavour number, although not of overall lepton
number. The transition probabilities given by eqns 11.10 and 11.11 are
simple oscillatory functions of distance and are generally described as
‘oscillation probabilities’; $P(\alpha \to \beta)$ is referred to as the $\nu_\beta$ appearance
probability and $P(\alpha \to \alpha)$ as the $\nu_\alpha$ survival probability. Before neutrino
oscillations were discovered, there was no theoretical guidance as to the
value of $\Delta m^2$ and it was assumed that any mixing, i.e. θ, would be very
small.

Since neutrino masses are now known to be very small, $\Delta m^2$ must
also be small and L/E must be large for flavour-changing oscillations to
be observable. For practical purposes, the dependence of the oscillation
probabilities on L/E is usefully expressed as (see Exercise 11.7)

$$P(\alpha \to \beta) = \sin^2(2\theta)\,\sin^2\!\left(\frac{1.27\,\Delta m^2 L}{E}\right) \qquad (11.12)$$

$$P(\alpha \to \alpha) = 1 - \sin^2(2\theta)\,\sin^2\!\left(\frac{1.27\,\Delta m^2 L}{E}\right) \qquad (11.13)$$

where L/E is in km GeV⁻¹ or m MeV⁻¹ and $\Delta m^2$ is in eV². Even with
$\Delta m^2$ as (unrealistically) large as 1 eV², a detector would have to be 1 km
from a source of 1 GeV neutrinos for the phase of the energy-dependent
terms in eqns 11.12 or 11.13 to approach π/2 and for such an experiment
to be sensitive to oscillations. The smallness of $\Delta m^2$ therefore explains
why neutrino oscillations were discovered with neutrinos from natural
sources (cosmic rays and the Sun) rather than at particle accelerators.
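
The numerical factor in the practical-units form of the oscillation phase comes from restoring ħc in Δm²L/4E. The Python sketch below is a minimal illustration, assuming the standard value ħc ≈ 1.973 × 10⁻⁷ eV m; the function names are hypothetical. It evaluates the coefficient for L in km and E in GeV, and then the appearance probability for the 1 eV², 1 km, 1 GeV case discussed above.

```python
import math

# hbar * c in eV m (standard value, quoted here as an assumption of the sketch).
HBAR_C_EV_M = 1.973269804e-7


def phase_coefficient():
    """Coefficient k in phase = k * dm2[eV^2] * L[km] / E[GeV].

    Obtained from Delta m^2 L / (4 E hbar c) with L converted from km to m
    and E converted from GeV to eV.
    """
    metres_per_km = 1.0e3
    ev_per_gev = 1.0e9
    return metres_per_km / (4.0 * ev_per_gev * HBAR_C_EV_M)


def appearance_probability(sin2_2theta, dm2_ev2, l_km, e_gev):
    """Two-flavour appearance probability in practical units."""
    phase = phase_coefficient() * dm2_ev2 * l_km / e_gev
    return sin2_2theta * math.sin(phase) ** 2


print(f"coefficient = {phase_coefficient():.4f}")
print(f"P(1 eV^2, 1 km, 1 GeV, maximal mixing) = "
      f"{appearance_probability(1.0, 1.0, 1.0, 1.0):.3f}")
```

The coefficient comes out close to 1.27, so for Δm² = 1 eV² and L/E = 1 km GeV⁻¹ the phase is somewhat below π/2 and the appearance probability is already near its maximum.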


11.4 Evidence for neutrino oscillations 


Detecting neutrinos is difficult because of the very small cross sections
involved. The neutrino-proton cross section scales with neutrino energy
$E_\nu$ as $\sigma \sim G_F^2 m_p E_\nu$, so, for example, $\sigma(\nu_e p) \sim 10^{-41}\,\mathrm{cm}^2$ for a 10 MeV
neutrino.⁵ Therefore, very large active targets are required and/or very
intense neutrino sources such as nuclear reactors.

Increased sensitivity to small mass differences can be achieved by us- 
ing lower-energy neutrinos and making observations at greater distances 
from the source (see Section 11.3). The actual energies and distances 
used in real experiments are a compromise between these factors and the 
requirement to have a measurable reaction rate. Experiments looking for 
neutrinos produced in the Sun or by cosmic rays in the atmosphere need 
to be located deep underground to reduce the background from cos- 
mic rays. For the case of low-energy neutrinos studied in solar neutrino 
oscillations, extreme care must be taken to minimize radioactive back- 
grounds. For the higher-energy neutrinos produced by accelerators, the 
detectors can be similar to those used to study neutrino deep inelastic 
Neutrino source              Flavour   E_ν (MeV)    L (m)    Δm² (eV²)
Long-baseline accelerator    μ         10³-10⁴      10⁶      10⁻³
Atmospheric                  e, μ      10³          10⁷      10⁻⁴
ν̄_e, reactor                 e         1            10³      10⁻³
ν_e, solar                   e         1            10¹¹     10⁻¹¹

Table 11.1 Approximate sensitivities of the different types of neutrino oscillation 
experiments. 


scattering (see Chapter 9). In the case of neutrinos from nuclear reactors 
or secondary beams from accelerators, if there are detectors at different 
distances from the source, the oscillations can be measured without need- 
ing to know the absolute neutrino flux of the source, thus eliminating a 
major source of systematic uncertainty. Alternatively, if the distance is 
fixed, one can study the change in the energy spectrum caused by oscilla- 
tions. The approximate sensitivity for neutrino mass-squared differences 
using various sources of neutrinos is given in Table 11.1. 

We will first look at the evidence for oscillations from atmospheric 
neutrinos (Section 11.4.1). The laboratory confirmation of these oscil- 
lations will be briefly reviewed in Section 11.4.2. We will then review 
the evidence for solar neutrino oscillations (Section 11.4.3) and describe 
the confirmation of these oscillations from experiments using reactor 
neutrinos (Section 11.4.5). The explanation of the solar neutrino oscil- 
lations requires enhanced oscillation probabilities when neutrinos travel 
through matter (the so-called MSW effect) and a simple explanation will 
be given in Section 11.4.4. The prospects for studying CP violation in
the neutrino sector will be briefly reviewed in Section 11.5.3. Finally, in
Section 11.5.4, we will review the most exciting prospect in the area of
neutrino oscillations, namely the idea that neutrino oscillations might
explain the observed matter-antimatter asymmetry of the Universe.


11.4.1 Atmospheric neutrinos 


Neutrinos are produced by cosmic rays (protons and heavier nuclei)
colliding with air nuclei in the atmosphere. The flux of high-energy
cosmic rays is found to scale with energy E proportionally to $E^{-2.7}$.
However, below about 20 GeV, the primary cosmic rays are strongly affected
by the Earth’s magnetic field. Even allowing for the linear increase
in neutrino cross section with energy, the neutrinos that interact will
be predominantly of low energy; the event rate peaks at about 1 GeV.
A large background is that from other cosmic rays that interact
electromagnetically or strongly and can overwhelm the neutrino signal. This
background is greatly suppressed by operating the detectors deep
underground.⁶ The primary cosmic rays interact with nitrogen and oxygen

6 Typical depths are greater than about 1 km.
7 The atmosphere is approximately 11
nuclear interaction lengths deep, so the 
probability of a primary cosmic ray 
not interacting in the atmosphere is 
negligible. 


8 This detector and others like Soudan
were originally designed to search for 
proton decay, with neutrino inter- 
actions being regarded as a source of 
background. 


9 If a muon stops inside the detector,
the subsequent decay will produce a 
delayed electron that produces a dis- 
placed ring. This provides an additional 
identification power for muons. 


Fig. 11.1 Super-Kamiokande event 
displays [129] for (a) a muon neutrino 
event and (b) an electron neutrino 
event. 


in the atmosphere,⁷ producing charged pions, which decay in turn to
muon and electron neutrinos according to the decay chain for $\pi^+$ (with
a similar decay chain for $\pi^-$):

$$\pi^+ \to \mu^+ + \nu_\mu$$
$$\mu^+ \to e^+ + \bar{\nu}_\mu + \nu_e$$


Thus, we naively (see Exercise 11.3) expect two muon-type neutrinos
for each $\pi^+$ but only one electron-type neutrino. The best data come
from the Super-Kamiokande experiment, which uses a 50000 ton water
Cerenkov detector to detect the electrons and muons produced by
the neutrino interactions in the water.⁸ The Cerenkov light produced
by a charged particle travelling with a speed greater than the speed of
light in water is emitted in a ring around the direction of the particle’s
motion (see Chapter 4). Electron neutrinos produce electrons, which
in turn produce electromagnetic showers in the water. Therefore, electron
neutrinos tend to produce ‘fuzzier’ rings than muon neutrinos and
this can be used to provide a powerful statistical separation between
the two flavours of neutrinos. The different responses of the
Super-Kamiokande detector to electrons and muons are illustrated in the event
displays [129] in Fig. 11.1.⁹ Although there are 20% uncertainties in the
absolute neutrino fluxes, some of the uncertainties cancel in the ratio
$R = \mathrm{Flux}(\nu_\mu)/\mathrm{Flux}(\nu_e)$. The data are usually presented in terms of the
double ratio $R' = R_{\mathrm{experiment}}/R_{\mathrm{predicted}}$, where the prediction assumes
no neutrino oscillations. The results from the Super-Kamiokande experiment
and other experiments [115] are summarized in Table 11.2. The
values of $R'$ are significantly lower than the value of unity that would be
expected in the absence of neutrino oscillations.

Even more direct evidence for neutrino oscillations comes from the
zenith-angle distribution. The zenith angle $\theta_z$ is the angle between the
vertical and the direction of the incoming neutrino. So values of $\cos\theta_z$
close to 1 correspond to distances travelled by the neutrinos of ~10 km,
whereas those with negative values of $\cos\theta_z$ have travelled for
distances of ~10000 km. The distributions from Super-Kamiokande [115]
are shown in Fig. 11.2.




Experiment               Double ratio R′
Kamiokande-s             $0.60^{+0.06}_{-0.05} \pm 0.05$
Kamiokande-m             $0.57^{+0.08}_{-0.07} \pm 0.07$
Soudan-2 (iron)          $0.66 \pm 0.11 \pm 0.06$
Super-Kamiokande-s       $0.64 \pm 0.02 \pm 0.05$
Super-Kamiokande-m       $0.68 \pm 0.03 \pm 0.05$

Table 11.2 Measurement of atmospheric neutrino ratio R’. Exposure is in units of 
kton-years and the suffixes ‘s’ and ‘m’ stand for sub-GeV and multi-GeV, respectively. 
The errors quoted are the statistical and systematic errors. 
Fig. 11.2 Zenith-angle distributions
for the electron-like (a, c, e) and muon-like
(b, d, f) events [115]. The events
are classified in three energy ranges:
less than 400 MeV (a, b), less than
1 GeV (c, d), and greater than 1 GeV
(e, f). The dotted histograms show the
expectations in the absence of neutrino
oscillations and the solid histogram
is the result of the fit for $\nu_\mu \to \nu_\tau$
oscillations.
10 We cannot focus neutrinos, but we
can produce a somewhat collimated 
neutrino beam by focusing the charged 
pions before they decay. The focusing 
uses toroidal fields that vary as 1/r 
(where r is the radial distance from 
the beam axis). This ensures that par- 
ticles emerging at large angles to the 
beam axis remain in the field region 
for longer, which in turn ensures that 
charged particles within a certain angu- 
lar range emerge parallel to the beam 
axis. This is achieved using ‘magnetic 
horns’ that require currents of ~100 kA. 
This is too large for DC currents, so 
pulsed currents are used (synchronized 
to the proton bunches). 


Fig. 11.3 Schematic view of the struc- 
ture used to create intense neutrino 
beams. 


These distributions show good agreement with the no-oscillation
calculations for the electron neutrinos and for the muon neutrinos at large
positive values of $\cos\theta_z$. However, the muon neutrinos show a large
deficit at negative values of $\cos\theta_z$. This is exactly what would be
expected for muon neutrino oscillations, because the events at negative
values of $\cos\theta_z$ correspond to neutrinos that have travelled large
distances through the Earth (~10000 km), whereas the events at positive
$\cos\theta_z$ correspond to neutrinos that have only travelled ~20 km. The
data can be explained by neutrino oscillations of the type $\nu_\mu \to \nu_\tau$ with
$\sin^2 2\theta \approx 1$ and $\Delta m^2 \approx 0.003$ eV².
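
To see why these parameters produce an up-down asymmetry, the following Python sketch evaluates the two-flavour survival probability in practical units (numerical factor 1.27 for L in km, E in GeV and Δm² in eV²) for a downward-going path of 20 km and for an upward-going path of 10000 km. The energy band of 0.5-5 GeV used for the average, the flat weighting in energy and the function names are assumptions made purely for illustration.

```python
import math


def survival_probability(sin2_2theta, dm2_ev2, l_km, e_gev):
    """Two-flavour nu_mu survival probability in practical units."""
    phase = 1.27 * dm2_ev2 * l_km / e_gev
    return 1.0 - sin2_2theta * math.sin(phase) ** 2


def band_averaged_survival(sin2_2theta, dm2_ev2, l_km,
                           e_min_gev, e_max_gev, steps=2000):
    """Survival probability averaged over a flat energy band (midpoint rule)."""
    total = 0.0
    for i in range(steps):
        e_gev = e_min_gev + (e_max_gev - e_min_gev) * (i + 0.5) / steps
        total += survival_probability(sin2_2theta, dm2_ev2, l_km, e_gev)
    return total / steps


SIN2_2THETA = 1.0   # value quoted in the text
DM2_EV2 = 0.003     # value quoted in the text, in eV^2

down_going = survival_probability(SIN2_2THETA, DM2_EV2, 20.0, 1.0)
up_going = band_averaged_survival(SIN2_2THETA, DM2_EV2, 10000.0, 0.5, 5.0)
print(f"down-going (20 km, 1 GeV):       {down_going:.3f}")
print(f"up-going (10000 km, 0.5-5 GeV):  {up_going:.3f}")
```

Downward-going muon neutrinos are essentially unaffected, whereas for upward-going neutrinos the phase varies rapidly with energy and the averaged survival probability is close to one half, in line with the deficit at negative cos θ_z.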


11.4.2 Laboratory confirmation of atmospheric 
neutrino oscillation 


We start this section with the description of how a neutrino beam is 
produced from an accelerator. Then follows a discussion of results from 
the MINOS experiment. 


The NuMI beam at Fermilab 


The creation¹⁰ of a high-intensity, high-energy, laboratory beam of neutrinos
(and antineutrinos) starts by extracting the primary 120 GeV
proton beam from the Main Injector ($2.5 \times 10^{13}$ protons per pulse) and
directing it at a long thin target (see Fig. 11.3). This produces a large flux
of charged pions and kaons, with energies in the range 2-60 GeV, which
is then focused and collimated, before entering a 675 m evacuated pipe
in which the πs and Ks decay to $\mu\nu_\mu$ (see Exercise 11.8). The decay pipe
is aimed at the MINOS far detector. The next stage is to absorb any 
remaining hadrons and the large flux of muons. The hadron absorber, 
consisting mainly of steel, is placed immediately after the decay pipe. 
Finally, muons must be removed before the NuMI beam passes through 
the MINOS near detector in a cavern 240 m beyond the hadron absorber. 
The intervening rock is dolomite and of sufficient density to absorb the 
muon flux within that distance. 

By the time the neutrinos have reached a distant detector, the beam 
size will have become too large for a detector to contain the beam. 
Therefore, the detector volume is made as large as can be afforded to 


Magnetic horn 


Absorber 
p beam 
Target Decay volume 
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compensate for the small neutrino cross sections. In practice, the op- 
timal detector shape tends to be approximately cylindrical, with the 
length along the beam direction being much greater than the width. As 
very large detectors are required, relatively cheap and simple detector 
technologies must be employed (see Chapter 4 for a discussion of generic 
neutrino detectors). 


MINOS 


For example, the MINOS¹¹ experiment [109] uses magnetized iron as
the target, with scintillator layers interspersed with the iron plates. The
MINOS detector has two parts: the far detector located at a distance of
735 km from Fermilab and a near detector at Fermilab. The far detector
consists of alternating plates of iron and scintillator, with a total mass of
5400 tons.¹² The iron acts as the passive absorber for the calorimeter, but
in addition it is magnetized. The scintillator is segmented into narrow
strips, which are read out by wavelength-shifting fibres.¹³ These fibres
are coupled to photomultiplier tubes via additional clear fibres. The 
scintillator signals are used for calorimeter measurements as well as for 
muon tracking. The muons are identified because they penetrate much 
deeper into the detector. Their momentum can be determined from track 
curvature in the magnetic field, but if they are contained in the detector, 
a more precise determination can be made from their range. 

The detector is located 716m underground (in the Soudan mine) in 
order to minimize backgrounds from cosmic rays. To further reduce back- 
grounds, veto detectors are located around the main detector. Therefore, 
the signal for charged-current interactions of $\nu_\mu$ is the presence of a
high-momentum muon. The hadronic energy can be measured by the total
energy deposited in the scintillators. A critical feature of the experiment 
is that there is a near detector close to the origin of the neutrino beam 
as well as a far detector at a distance of 735km. The near detector 
measures the neutrino flux at the origin and can therefore be used to 
predict the flux and energy spectrum of neutrinos at the far detector in 
the absence of oscillations. It was the difference between the measured 
and predicted rates in the far detector that confirmed [10] the presence 
of muon neutrino oscillations (see Fig. 11.4). 
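
The shape of the far-to-predicted ratio can be anticipated from the two-flavour survival probability. The Python sketch below is illustrative only: it uses the 735 km baseline quoted above together with the atmospheric values sin²2θ ≈ 1 and Δm² ≈ 0.003 eV² from Section 11.4.1 (not the MINOS fit results), the practical-units factor 1.27, and hypothetical function names. It locates the energy of the first oscillation minimum and evaluates the expected ratio at a few energies.

```python
import math

BASELINE_KM = 735.0   # Fermilab to far detector, from the text
DM2_EV2 = 0.003       # illustrative atmospheric value, in eV^2
SIN2_2THETA = 1.0     # illustrative maximal mixing


def first_minimum_energy_gev(dm2_ev2, l_km):
    """Energy at which the oscillation phase 1.27 dm2 L / E equals pi/2."""
    return 1.27 * dm2_ev2 * l_km / (math.pi / 2.0)


def far_over_unoscillated(e_gev):
    """Expected ratio of oscillated to unoscillated nu_mu rate at energy e_gev."""
    phase = 1.27 * DM2_EV2 * BASELINE_KM / e_gev
    return 1.0 - SIN2_2THETA * math.sin(phase) ** 2


e_dip = first_minimum_energy_gev(DM2_EV2, BASELINE_KM)
print(f"first minimum at E = {e_dip:.2f} GeV")
for e_gev in (1.0, e_dip, 3.0, 5.0, 10.0):
    print(f"E = {e_gev:5.2f} GeV  ratio = {far_over_unoscillated(e_gev):.3f}")
```

For these inputs the deepest deficit sits at roughly 1.8 GeV and the ratio returns towards unity at higher energies, which is the qualitative behaviour a near/far comparison is designed to reveal.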


11.4.3 Solar neutrinos 


The Sun generates energy by nuclear fusion reactions. The most 
important reactions are those of the pp cycle: 


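$$\begin{aligned} p + p &\to {}^2\mathrm{H} + \nu_e + e^+ &&+ 0.42\ \mathrm{MeV} \\ {}^2\mathrm{H} + p &\to {}^3\mathrm{He} + \gamma &&+ 5.49\ \mathrm{MeV} \\ {}^3\mathrm{He} + {}^3\mathrm{He} &\to {}^4\mathrm{He} + 2p &&+ 12.86\ \mathrm{MeV} \end{aligned} \qquad (11.14)$$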
11 Main Injector Neutrino Oscillation
Search. 


12 Unless otherwise stated, ‘MINOS de- 
tector’ refers to the far detector. 


13 The wavelength-shifting fibres are 
doped with a suitable chemical to 
convert shorter-wavelength photons to 
longer, which increases the absorption 
length. The photons trapped inside the 
fibre can easily be transported to the 
photomultipliers using total internal re- 
flection (as in normal optical fibres). 
Fig. 11.4 Evidence for $\nu_\mu$ disappearance
from the MINOS experiment.
(a) Energy spectrum of the neutrinos 
as reconstructed in the far detector. 
The data are compared with the no- 
oscillation hypothesis and with an os- 
cillation fit. (b) Ratio of the observed 
spectrum to that expected from the 
no-oscillation hypothesis [10]. 


14 The low-energy neutrinos from the pp
cycle have been measured using Ga de- 
tectors; however, the main evidence for 
neutrino oscillations relies on measure- 
ments of the higher-energy neutrinos. 


15 C₂Cl₄ (perchloroethylene) is often
used in dry cleaning. 
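Other branches of the cycle can produce higher-energy neutrinos, which
are easier to detect; for example, we can have

$$\begin{aligned} {}^3\mathrm{He} + {}^4\mathrm{He} &\to {}^7\mathrm{Be} + \gamma &&+ 1.59\ \mathrm{MeV} \\ {}^7\mathrm{Be} + p &\to {}^8\mathrm{B} + \gamma &&+ 0.14\ \mathrm{MeV} \\ {}^8\mathrm{B} &\to {}^8\mathrm{Be} + e^+ + \nu_e &&+ 14.6\ \mathrm{MeV} \\ {}^8\mathrm{Be} &\to 2\,{}^4\mathrm{He} &&+ 3\ \mathrm{MeV} \end{aligned} \qquad (11.15)$$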

where the endpoint of the resulting neutrino spectrum is at 14.6 MeV. 
The Standard Solar Model (SSM) prediction [42] for the solar neutrino 
spectrum is shown in Fig. 11.5. Note that the bulk of the spectrum is 
due to the pp cycle and is therefore at very low energy, which makes the 
neutrino detection much more difficult.¹⁴

The first technique used to detect solar neutrinos was based on the 
reaction $\nu_e + {}^{37}\mathrm{Cl} \to {}^{37}\mathrm{Ar} + e^-$. The chlorine was contained in a tank of
630 tons of C₂Cl₄,¹⁵ from which the argon was extracted periodically.
The $^{37}$Ar decays by electron capture (the inverse reaction to that which
produced it) and is left in an excited state. The atomic electron capture
leaves a ‘hole’ in a low-energy state, and this will be filled by an electron
from a higher energy level. The energy released can result in an outer
electron being ejected from the atom (the ‘Auger’ effect). The resulting
Auger electrons were detected in a proportional counter (see Chapter 4).
This experiment was started over 40 years ago by Ray Davis and was
the first to detect solar neutrinos and show that the rate was lower than
expected. Davis shared the 2002 Physics Nobel Prize with Masatoshi
Koshiba (Kamiokande) and Riccardo Giacconi.¹⁶

Several experiments have seen a clear deficit of electron neutrinos 
compared with the predictions of the SSM. It is very difficult to recon- 
cile these data with modifications of the SSM, whereas they could all 
be explained by electron neutrinos oscillating into other flavours [115]. 
However, the most model-independent demonstration that the neutrino 
deficit is due to oscillations rather than problems with the SSM comes 
from the SNO experiment [12]. The SNO experiment used 1000 tons 
of very pure D₂O viewed by 10000 large photomultipliers. The primary
reactions are the charged-current (CC) interactions of the electron
neutrinos,

$$\nu_e + D \to p + p + e^- \qquad (11.16)$$

the neutral-current (NC) interactions of all neutrino flavours,

$$\nu_x + D \to p + n + \nu_x \qquad (11.17)$$

and the elastic scattering (ES) reaction

$$\nu_x + e^- \to \nu_x + e^- \qquad (11.18)$$


The Feynman diagrams for the three processes are shown in Fig. 11.6. 
The CC interactions are sensitive only to electron neutrinos because the 
thresholds of the equivalent reactions with other neutrino flavours are 
too high. The NC interactions are equally sensitive to all neutrino 
flavours. All three neutrino flavours contribute to the ES reactions, 
but the cross section is much larger (~6 times) for electron neutrinos. 
Therefore, by measuring the rates for CC, NC, and ES interactions, i.e.
enough information to deduce the total $\nu_e + \nu_\mu + \nu_\tau$ rate, one can look
Fig. 11.5 Predicted neutrino energy spectrum in the SSM [115].


16 The former director of the Space Telescope Science Institute.
Fig. 11.6 Feynman diagrams at the quark level for charged-current (a),
neutral-current (b), and elastic scattering (c) reactions.
Fig. 11.7 Fluxes of $^8$B solar neutrinos, deduced from the SNO
charged-current, neutral-current, and elastic scattering results from the
salt-phase measurement and the results from Super-Kamiokande. The
vertical axis is the flux of muon and tau neutrinos $\phi_{\mu\tau}$ and the
horizontal axis is the flux of electron neutrinos $\phi_e$. The expectations
from the SSM are shown by dashed lines [115].


for evidence of neutrino flavour transitions, independent of the SSM. 
Unambiguous evidence for neutrino flavour transitions is then provided 
by the measured flux of all neutrino flavours (as determined by the NC 
interactions) and the flux of electron neutrinos as determined from the 
CC interactions. The signals for the CC and ES interactions come from 
the produced electrons generating electromagnetic showers, which then 
lead to Cerenkov radiation, which is detected in the photomultipliers. 
The NC interactions are more difficult to detect. 

In the first phase of SNO, the neutrons released in NC interactions were
detected by neutron capture on deuterium, which results in 6.25 MeV
photons, which produce electromagnetic showers that then lead to
Cerenkov radiation. In order to increase the sensitivity of the
experiment to NC interactions, in the second (salt) phase of SNO
operation, 2 tons of NaCl were added to the D$_2$O because this enabled
neutron capture on $^{35}$Cl. The $^{35}$Cl has a high absorption cross
section for low-energy neutrons, and neutron capture leads to photons
with an energy distribution peaked around 8 MeV. The measured fluxes
from SNO (in units of $10^6\,\mathrm{cm^{-2}\,s^{-1}}$) are [12]


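$$\begin{aligned}
\phi_{\mathrm{CC}} &= 1.68 \pm 0.06\ (\mathrm{stat.})\ ^{+0.08}_{-0.09}\ (\mathrm{syst.}) \\
\phi_{\mathrm{ES}} &= 2.35 \pm 0.22\ (\mathrm{stat.}) \pm 0.15\ (\mathrm{syst.}) \\
\phi_{\mathrm{NC}} &= 4.94 \pm 0.21\ (\mathrm{stat.})\ ^{+0.38}_{-0.34}\ (\mathrm{syst.})
\end{aligned} \tag{11.19}$$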


This combination of measurements gives the determination of the flux
of muon and tau neutrinos (in units of $10^6\,\mathrm{cm^{-2}\,s^{-1}}$) as

$$\phi(\nu_\mu) + \phi(\nu_\tau) = 3.26 \pm 0.25\ (\mathrm{stat.})\ ^{+0.40}_{-0.35}\ (\mathrm{syst.}) \tag{11.20}$$


This observation of a non-zero flux of $\nu_\mu$ or $\nu_\tau$ from the Sun
therefore provides model-independent evidence of solar neutrino
oscillations. The SNO results are combined with those from
Super-Kamiokande [115] and are shown in Fig. 11.7. The data are clearly
inconsistent with the no-oscillation hypothesis
($\phi(\nu_\mu) + \phi(\nu_\tau) = 0$). All the data are consistent with each
other and with the predictions of the SSM. Finally, the combined results
from all phases of the SNO experiment can also be used to measure the
total flux of $^8$B neutrinos¹⁷ from the Sun,
$\Phi_{\mathrm{SNO}} = 5.25 \pm 0.16\ (\mathrm{stat.})\ ^{+0.11}_{-0.13}\ (\mathrm{syst.})$, which is in good agreement with
the SSM prediction of ssm = 5.9412). This confirms that the Sun is 
producing energy from nuclear fusion at the rate expected in the SSM. 
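
As a quick arithmetic check of the fluxes quoted above, the following minimal sketch recovers the muon plus tau neutrino flux as the difference between the NC and CC fluxes and evaluates the fraction of the flux that still arrives as electron neutrinos. The function names are illustrative, only the central values 1.68 and 4.94 are used, and the quoted uncertainties are ignored.

```python
# SNO salt-phase central values from the text, in units of 1e6 cm^-2 s^-1.
PHI_CC = 1.68  # charged current: electron neutrinos only
PHI_NC = 4.94  # neutral current: all three flavours equally


def mu_tau_flux(phi_nc: float, phi_cc: float) -> float:
    """Flux of muon plus tau neutrinos, taken as phi_NC - phi_CC."""
    return phi_nc - phi_cc


def electron_fraction(phi_cc: float, phi_nc: float) -> float:
    """Fraction of the total active-neutrino flux that is still nu_e."""
    return phi_cc / phi_nc


if __name__ == "__main__":
    print(f"phi(mu) + phi(tau) = {mu_tau_flux(PHI_NC, PHI_CC):.2f}")
    print(f"nu_e fraction      = {electron_fraction(PHI_CC, PHI_NC):.2f}")
```

With these central values the difference is 3.26 and the electron-neutrino fraction is about 0.34, i.e. below one half.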

Although the SNO results confirm the hypothesis that neutrino
oscillations occur, in the context of the vacuum oscillation model we have
considered it is difficult to understand why the ratio of electron
neutrinos to all flavours of neutrinos $\nu_e/\nu_x$ is $< \tfrac{1}{2}$. If the
oscillations are occurring rapidly compared with the Sun–Earth distance,
then the minimum value of the ratio would be $\tfrac{1}{2}$. This statement is
strictly only true in
a two-flavour model, but this is a good approximation for solar neutri- 
nos. Before considering the probable explanation for this, we first need 
to consider other solar neutrino experiments that are sensitive to lower- 
energy neutrinos. Most of the total flux of neutrinos comes from the first 
reaction in the pp chain (see eqn. 11.14), which generates neutrinos with 
a continuous spectrum with an endpoint at 0.42 MeV (see Fig. 11.5). 
The flux of these low-energy neutrinos is very tightly constrained by 
the observed solar luminosity, unlike the flux of higher-energy neutri- 
nos, which are more sensitive to details of the solar model (in particular 
the core temperature). Therefore, the original motivation for efforts to 
detect these low-energy neutrinos was to see if the solar neutrino deficit 
was due to the solar model or to neutrino oscillations. However, from 
our perspective, these data are particularly interesting when compared 
with the higher-energy data, because they reveal a significant energy 
dependence in the oscillation probability. 

The detection of very low-energy neutrinos ($E_\nu < 0.42$ MeV) is
challenging. The radioactive backgrounds in water Cerenkov detectors are
too large to allow a sufficiently low threshold to be set. Such low-energy 
neutrinos have been studied by radiochemical experiments using the 
reaction $\nu_e + {}^{71}\mathrm{Ga} \to {}^{71}\mathrm{Ge} + e^-$, which has a threshold energy of
233 keV. Slightly over half of the interactions are expected to come from
the low-energy pp neutrinos. In the GALLEX experiment, the produced
Ge was periodically extracted (about every 10 days) chemically and the
number of $^{71}$Ge atoms was detected using electron capture (essentially
the inverse of the reaction that created the Ge). The electron capture
leaves a hole in the K or L shell, which is filled by electrons from
higher energy states, resulting in X-ray emission. The X-rays are
detected in proportional counters filled with xenon gas (xenon is a good
absorber for X-rays because of its large Z value).¹⁸ The results from
the different gallium experiments are summarized in [115] and show a
suppression compared with the prediction of the SSM of a factor of
about $\tfrac{1}{2}$. Comparing these results for low-energy neutrinos with those
from SNO and Super-Kamiokande, which are sensitive to higher-energy 
neutrinos, we can see that the neutrino oscillation probability has a 
significant energy dependence. These observations are hard to reconcile 
with vacuum oscillations, so we need to consider the effect of matter on 
neutrino oscillations, which we will do in the next section. 


17 The threshold used by SNO to reduce the backgrounds meant that the
detector was mainly sensitive to $^8$B neutrinos.


18 The target contains 101 tons of GaCl$_3$ and a typical 10-day run
produces about ten $^{71}$Ge atoms. Therefore, the detector has to be deep
underground to be shielded from cosmic rays, and extreme care must be
taken to minimize radioactive backgrounds.
19 Here we give a simplified description of the MSW effect in order to
illustrate how it can resolve this apparent paradox.


20We are working in the flavour basis 
for the neutrino eigenstates. 


21 The effect on neutrino propagation is 
similar to a change in refractive index 
in the optical case. This then changes 
the propagation speed for electron neu- 
trinos and hence the oscillation rate. 


11.4.4 MSW effect 


There is very strong evidence for neutrino oscillations, as we have seen
in Sections 11.4.1–11.4.3 (see also Section 11.4.5). The atmospheric
neutrino data can be explained by the phenomenology of vacuum oscillations
(see Section 11.3) with a close to maximal mixing angle between $\nu_\mu$
and $\nu_\tau$. This implies that we can consider the electron neutrino mixing
in the simple two-component picture (the $\nu_e$ mixes with a linear
superposition of $\nu_\mu$ and $\nu_\tau$). Therefore, the probability of a $\nu_e$
created in the Sun remaining as a $\nu_e$ is given by


$$P_{\nu_e \to \nu_e}(t) = 1 - \sin^2 2\theta \, \sin^2\!\left( \frac{\Delta m^2 t}{4E} \right) \tag{11.21}$$


For Am? > 107*eV?, the phase factor oscillates so rapidly that it 
will average to a value of $\tfrac{1}{2}$, and therefore the minimum value of
$P_{\nu_e \to \nu_e}(t)$ would be $\tfrac{1}{2}$. However, the SNO and other data clearly indicate
a significantly lower value (see Section 11.4.3).
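
The averaging argument can be checked numerically. The sketch below is an illustration only (the function names and the midpoint sampling scheme are choices made here, not taken from the text): it averages the two-flavour survival probability over one full period of the oscillation phase and shows that even maximal mixing cannot push the averaged survival probability below one half.

```python
import math


def survival_probability(sin2_2theta: float, phase: float) -> float:
    """Two-flavour survival probability for a given phase dm2 * t / 4E."""
    return 1.0 - sin2_2theta * math.sin(phase) ** 2


def averaged_survival(sin2_2theta: float, n_samples: int = 10_000) -> float:
    """Average over one full period of sin^2 (phase from 0 to pi)."""
    total = 0.0
    for i in range(n_samples):
        phase = math.pi * (i + 0.5) / n_samples  # midpoint sampling
        total += survival_probability(sin2_2theta, phase)
    return total / n_samples


if __name__ == "__main__":
    for s in (0.25, 0.5, 1.0):
        print(f"sin^2(2 theta) = {s:.2f} -> <P> = {averaged_survival(s):.3f}")
```

The average equals $1 - \tfrac{1}{2}\sin^2 2\theta$, so its smallest possible value, reached for maximal mixing, is $\tfrac{1}{2}$.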

This apparent paradox can be understood if we allow for the effects of 
matter on the propagation of neutrinos through the Sun, the so-called 
Mikheyev-Smirnov-Wolfenstein (MSW) effect¹⁹ [108, 135]. The effective
Hamiltonian differs from its vacuum counterpart by the addition
of weak interactions. All flavours of neutrinos can have neutral-current 
interactions, but only (low-energy) electron neutrinos can undergo 
charged-current interactions. From the perspective of oscillations, we 
are only interested in the differences between electron neutrinos and 
muon or tau neutrinos, and therefore we do not need to consider the 
neutral-current interactions. The probability of incoherent scattering in 
the Sun is negligible and in any case could not contribute to interference 
effects. We are therefore only interested in the charged-current coherent 
forward scattering, which gives a contribution to the Hamiltonian for 
electron neutrinos²⁰ of

$$H(r) = \sqrt{2}\, G_F N_e(r) \tag{11.22}$$

where $G_F$ is the Fermi coupling constant and $N_e(r)$ is the electron
volume number density at a radius $r$ in the Sun.²¹

To understand the MSW effect, we start by considering the 
Schrödinger equation for two neutrino flavours propagating in vacuum. 
For oscillations, we are only interested in the terms in the Hamiltonian 
that are different for electron neutrinos compared with other flavours of 
neutrinos. We can write this part of the Hamiltonian as 


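$$H_V = \frac{\Delta m^2}{4p} \begin{pmatrix} -\cos 2\theta & \sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix} \tag{11.23}$$
and the time-independent Schrödinger equation as
$$H_V \begin{pmatrix} \nu_e \\ \nu_\beta \end{pmatrix} = E \begin{pmatrix} \nu_e \\ \nu_\beta \end{pmatrix} \tag{11.24}$$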
where $\nu_\beta$ represents the non-electron neutrino flavours (i.e. those that do
not interact with electrons by charged-current interactions). We can then
show that the difference in the energies of the two eigenstates is given by
$\Delta E = \Delta m^2/2p$ (see Exercise 11.5). The effect of the charged-current
coherent forward scattering is to change the effective potential for electron
neutrinos by $V_e = \sqrt{2}\, G_F N_e$, where $N_e$ is the electron number density.
We can evaluate the effect on the Hamiltonian using $E^2 - p^2 = m^2$ and
assuming that neutrinos are ultrarelativistic and $V_e \ll E$:


$$m_m^2 = (E + V_e)^2 - p^2 \approx m^2 + 2EV_e \tag{11.25}$$
Therefore, the change in $m^2$ for the electron neutrino is given by
$$\Delta m_m^2 = 2\sqrt{2}\, G_F N_e E \tag{11.26}$$


Again assuming that the neutrinos are ultrarelativistic, we can write the 
contribution from matter to the Hamiltonian as 


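$$\Delta H_M = \sqrt{2}\, G_F N_e \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix} \tag{11.27}$$
As usual, we can drop any term proportional to the unit matrix from
the point of view of oscillations. It is therefore convenient to rewrite this
as a term proportional to the unit matrix and the matrix relevant for
oscillations; keeping only the latter gives

$$\Delta H_M = \frac{\sqrt{2}\, G_F N_e}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{11.28}$$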


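We can then combine the vacuum term, eqn 11.23, with the matter term,
eqn 11.28, to obtain the Hamiltonian in the presence of matter:

$$H_M = \frac{\Delta m^2}{4p} \begin{pmatrix} -\cos 2\theta & \sin 2\theta \\ \sin 2\theta & \cos 2\theta \end{pmatrix} + \frac{\sqrt{2}\, G_F N_e}{2} \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \tag{11.29}$$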


It is conventional to define $A = 2\sqrt{2}\, G_F N_e p/\Delta m^2$, simplifying eqn 11.29
to give

$$H_M = \frac{\Delta m^2}{4p} \begin{pmatrix} -\cos 2\theta + A & \sin 2\theta \\ \sin 2\theta & \cos 2\theta - A \end{pmatrix} \tag{11.30}$$


We can then define an effective mixing angle in the presence of matter
as $\theta_m$ and define the total effective Hamiltonian in the presence of
matter as

$$H_M = \frac{\Delta m^2}{4p} \begin{pmatrix} -\cos 2\theta_m & \sin 2\theta_m \\ \sin 2\theta_m & \cos 2\theta_m \end{pmatrix} \tag{11.31}$$


Comparing eqn 11.30 with eqn 11.31, we can relate the mixing angle in 
the presence of matter to that in vacuum: 


$$\tan 2\theta_m = \frac{\tan 2\theta}{1 - A \sec 2\theta} \tag{11.32}$$
22 Solar neutrinos are made by nuclear fusion in the core of the Sun. As
the core of the Sun is much smaller than the full volume, they will be
created near the centre.


This clearly has a resonant condition if $A > 0$ and $A = \cos 2\theta$, or,
substituting for the definition of $A$, we find the electron neutrino resonant
energy at which the mixing becomes maximal

$$E_{\mathrm{res}} = \frac{\Delta m^2 \cos 2\theta}{2\sqrt{2}\, G_F N_e} \tag{11.33}$$
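
The behaviour of the matter mixing angle can be explored with a short numerical sketch. It assumes only the form of the matter Hamiltonian derived above, i.e. a symmetric 2 × 2 matrix with diagonal entries ∓(cos 2θ − A) and off-diagonal entries sin 2θ, in units of Δm²/4p. The function names are illustrative. `atan2` is used so that the angle remains well defined at and beyond the resonance, where the denominator of the tangent formula vanishes or changes sign.

```python
import math


def theta_matter(theta: float, a: float) -> float:
    """Effective mixing angle from tan(2 theta_m) = sin 2theta / (cos 2theta - A)."""
    return 0.5 * math.atan2(math.sin(2.0 * theta), math.cos(2.0 * theta) - a)


def heavy_state_residual(theta: float, a: float) -> float:
    """Check that (sin theta_m, cos theta_m) is the heavier eigenvector.

    Returns the largest component of (H - lambda) v, with H in units of
    dm2 / 4p; it should vanish up to rounding.
    """
    c = math.cos(2.0 * theta) - a
    s = math.sin(2.0 * theta)
    lam = math.hypot(c, s)  # positive eigenvalue of [[-c, s], [s, c]]
    th_m = theta_matter(theta, a)
    v0, v1 = math.sin(th_m), math.cos(th_m)
    r0 = -c * v0 + s * v1 - lam * v0
    r1 = s * v0 + c * v1 - lam * v1
    return max(abs(r0), abs(r1))


if __name__ == "__main__":
    theta = 0.59  # illustrative vacuum angle in radians
    for a in (0.0, 0.5 * math.cos(2 * theta), math.cos(2 * theta), 5.0):
        print(f"A = {a:.3f} -> theta_m = {theta_matter(theta, a):.3f} rad")
```

For A = 0 the vacuum angle is recovered, at A = cos 2θ the mixing is maximal (θ_m = π/4), and for A much larger than 1 the angle approaches π/2, so that the heavier eigenstate is almost purely electron neutrino.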


We can now use this MSW formalism to explain how the measured 
suppression of electron neutrinos can be energy-dependent and how the 
suppression of high-energy electron neutrinos can be larger than a factor 
of two, which would be the maximal value (assuming that the electron 
neutrinos make many oscillations between the Sun and the Earth). 

Electron neutrinos with energy $E > 2$ MeV at the centre of the Sun²²
will have a higher energy than $E_{\mathrm{res}}$ (see eqn 11.33 and Exercise 11.6).
Since the density of the Sun increases towards the centre (owing to
gravity), the electron density decreases smoothly as the neutrinos leave;
therefore, these electron neutrinos will hit the resonance condition. As
the density changes slowly with radius, this will happen ‘adiabatically’
and the neutrinos that will propagate out to the surface of the Sun will
be the heavier-mass eigenstate $\nu_2$:

$$\nu_2 = \nu_e \sin\theta + \nu_\beta \cos\theta \tag{11.34}$$

where $\theta$ is the vacuum mixing angle. Since these states are now
eigenstates of $H_V$, they will simply propagate to the Earth with no further
oscillations. From eqn 11.34, we can then easily see that the $\nu_e$ survival
probability is given by (eqn 11.11)

$$P(\nu_e \to \nu_e) = \sin^2\theta \tag{11.35}$$


and can therefore be less than $\tfrac{1}{2}$. Electron neutrinos with lower
energies will not see this resonance and will effectively propagate as vacuum
states. As these neutrinos are oscillating rapidly compared with the
transit time from the Sun to the Earth, the average phase factor will be
$\tfrac{1}{2}$, so the survival probability will be given by

$$P(\nu_e \to \nu_e) = 1 - \tfrac{1}{2} \sin^2 2\theta \tag{11.36}$$

The combined results of the solar neutrino experiments can be fitted in
terms of the MSW effect (see Section 11.4.4), with $\Delta m^2 \approx 5 \times 10^{-5}\ \mathrm{eV}^2$
and $\tan^2\theta \approx 0.45$ [115].
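
To get a feel for the numbers, the sketch below evaluates the resonance energy and the two limiting survival probabilities for the fit values just quoted. The electron density at the centre of the Sun (taken here as 6 × 10²⁵ cm⁻³) and the values of the physical constants are assumptions of this sketch rather than numbers from the text, and the function names are illustrative.

```python
import math

G_F = 1.1663787e-23      # Fermi constant in eV^-2 (1.1663787e-5 GeV^-2)
HBAR_C = 1.973269804e-5  # hbar * c in eV cm, converts cm^-3 to eV^3

DM2 = 5e-5               # eV^2, fit value quoted in the text
TAN2_THETA = 0.45        # fit value quoted in the text
N_E_CORE = 6e25          # electrons per cm^3 (assumed solar-core value)


def matter_potential(n_e_cm3: float) -> float:
    """sqrt(2) G_F N_e in eV."""
    return math.sqrt(2.0) * G_F * n_e_cm3 * HBAR_C ** 3


def resonance_energy(dm2: float, tan2_theta: float, n_e_cm3: float) -> float:
    """Resonance energy in eV: dm2 cos(2 theta) / (2 sqrt(2) G_F N_e)."""
    cos_2theta = (1.0 - tan2_theta) / (1.0 + tan2_theta)
    return dm2 * cos_2theta / (2.0 * matter_potential(n_e_cm3))


def survival_high_energy(tan2_theta: float) -> float:
    """Adiabatic MSW limit: sin^2(theta)."""
    return tan2_theta / (1.0 + tan2_theta)


def survival_low_energy(tan2_theta: float) -> float:
    """Averaged vacuum oscillations: 1 - 0.5 sin^2(2 theta)."""
    sin2_2theta = 4.0 * tan2_theta / (1.0 + tan2_theta) ** 2
    return 1.0 - 0.5 * sin2_2theta


if __name__ == "__main__":
    e_res = resonance_energy(DM2, TAN2_THETA, N_E_CORE)
    print(f"E_res           = {e_res / 1e6:.2f} MeV")
    print(f"P (high energy) = {survival_high_energy(TAN2_THETA):.2f}")
    print(f"P (low energy)  = {survival_low_energy(TAN2_THETA):.2f}")
```

Under these assumptions the resonance energy is of order 1 MeV, so the $^8$B neutrinos seen by SNO lie above it while the pp neutrinos lie below it. The two limiting survival probabilities come out at about 0.31 and 0.57.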


11.4.5 Confirmation of solar neutrino oscillations 


Confirmation that neutrino oscillation was the correct solution to the 
solar neutrino puzzle came from the KamLAND experiment. The de- 
tector is based deep underground in the Kamiokande site in Japan. 
It detects electron antineutrinos from a large number of Japanese nu- 
clear reactors. The flux-weighted average distance from the sources to 
the detector is about $L_0 = 180$ km. This long baseline provides
sensitivity to small values of the mass splitting. The primary reaction is
$\bar\nu_e + p \to e^+ + n$. The detector consists of 1000 tons of liquid scintillator,
surrounded by photomultipliers. The advantage of liquid scintillator over
water (Čerenkov) is that it produces a larger signal, thus enabling it to
achieve lower energy thresholds. This is important because the neutrino
energies are below 8 MeV. However, extreme care must still be taken to
minimize radioactive isotopes in the detector. The primary signal comes
from the $e^+$ but there is also a delayed signal from $\gamma$ rays after the
neutrons are captured on protons. This helps reduce the background and
enables the threshold to be lowered to 2.6 MeV [77].²³

The KamLAND experiment measured the $\bar\nu_e$ spectrum and, by dividing
by the expected flux from the no-oscillation model, it was possible
to measure the survival probability as a function of $L_0/E$ (where $E$ is
the neutrino energy). The results in Fig. 11.8 show clear evidence for 
neutrino oscillations and, even more interestingly, these data show the 
characteristic maxima and minima expected from oscillations. 
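
The positions of the expected minima can be estimated with a two-flavour sketch. It assumes the usual practical form of the oscillation phase, 1.267 Δm² [eV²] L [km] / E [GeV], and a solar mass splitting of 7.5 × 10⁻⁵ eV² (the value quoted later in this chapter). The function names are illustrative, and the real analysis sums over many reactors at different distances.

```python
import math

# Converts dm2 [eV^2] * L [km] / E [GeV] into the phase dm2 L / 4E.
PHASE_FACTOR = 1.267


def survival_probability(sin2_2theta, dm2_ev2, l_km, e_gev):
    """Two-flavour antineutrino survival probability."""
    phase = PHASE_FACTOR * dm2_ev2 * l_km / e_gev
    return 1.0 - sin2_2theta * math.sin(phase) ** 2


def minimum_energy_mev(n, dm2_ev2, l_km):
    """Energy in MeV of the n-th survival minimum, where phase = (2n - 1) pi / 2."""
    e_gev = 2.0 * PHASE_FACTOR * dm2_ev2 * l_km / ((2 * n - 1) * math.pi)
    return 1000.0 * e_gev


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(f"minimum {n}: E = {minimum_energy_mev(n, 7.5e-5, 180.0):.2f} MeV")
```

For a 180 km baseline the first minimum lies near 11 MeV, above the reactor spectrum, while the second lies near 3.6 MeV, inside the energy range observed by KamLAND.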

From a combined fit to the solar and KamLAND data, the solution 
to the solar neutrino problem requires the MSW effect. 


11.5 Three (or more)-flavour oscillations 


11.5.1 Generalized oscillation probabilities 


Expressions for the oscillation probabilities for the general case of N 
flavours and N massive neutrinos can be derived in the same way as 
for two flavours. The $N$ flavour eigenstates $|\nu_\alpha\rangle$ are expressed as linear
combinations of the $N$ mass eigenstates $|\nu_k\rangle$:

$$|\nu_\alpha\rangle = \sum_k U_{\alpha k} |\nu_k\rangle$$


Fig. 11.8 Ratio of the measured neutrino rate divided by that expected in
the no-oscillation model, as a function of $L_0/E_{\bar\nu_e}$. The lines are best
fits for neutrino oscillation models [77].


23 There is also a background from ‘geo- 
thermal’ neutrinos, i.e. neutrinos from 
the decays of radioactive isotopes in the 
Earth. In future, the study of geother- 
mal neutrinos might become interesting 
for geophysics research for distinguish- 
ing between different models of the 
Earth’s core. 
24 As the matrix $U$ is unitary, the inverse is $U^{-1} = U^\dagger$.


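where the coefficients $U_{\alpha k}$ are the elements of an $N \times N$ PMNS matrix.
In general, the $U_{\alpha k}$ are complex, and unitarity requires that

$$\sum_\alpha U_{\alpha k} U^*_{\alpha j} = \delta_{kj}, \qquad \sum_k U_{\alpha k} U^*_{\beta k} = \delta_{\alpha\beta}$$

If a neutrino of flavour $\alpha$ is created at $t = 0$, the initial state is

$$|\psi(0)\rangle = \sum_k U_{\alpha k} |\nu_k\rangle$$

At some later time $t$, the phases of the mass eigenstates have evolved
and

$$|\psi(t)\rangle = \sum_k U_{\alpha k} e^{-iE_k t} |\nu_k\rangle$$

where $E_k$ is the energy of the $\nu_k$. The mass eigenstates can be expressed
in terms of the flavour states using²⁴

$$|\nu_k\rangle = \sum_\beta U^*_{\beta k} |\nu_\beta\rangle$$

so

$$|\psi(t)\rangle = \sum_\beta \left( \sum_k U_{\alpha k} U^*_{\beta k} e^{-iE_k t} \right) |\nu_\beta\rangle$$

The amplitude for observing a $\nu_\beta$ at time $t$ is

$$\langle \nu_\beta | \psi(t) \rangle = \sum_k U_{\alpha k} U^*_{\beta k} e^{-iE_k t}$$

Then, as for the two-flavour case, taking $E_k \approx p + m_k^2/2p$ and $p \approx E$,

$$\langle \nu_\beta | \psi(t) \rangle = \sum_k \left( U_{\alpha k} U^*_{\beta k} e^{-i m_k^2 t/2E} \right) e^{-ipt}$$

and the probability of observing a neutrino of flavour $\beta$ at a distance $L$
from the source is

$$P(\nu_\alpha \to \nu_\beta) = |\langle \nu_\beta | \psi(t = L) \rangle|^2 = \sum_{k,j} U_{\alpha k} U^*_{\beta k} U^*_{\alpha j} U_{\beta j} \exp\!\left( -i \frac{\Delta m_{kj}^2 L}{2E} \right) \tag{11.37}$$

where $\Delta m_{kj}^2 = m_k^2 - m_j^2$.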
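
The double sum over mass eigenstates in the general probability is straightforward to evaluate numerically. The following sketch is an illustration, not the authors' code: it implements the sum for an arbitrary N × N mixing matrix and checks it against the familiar two-flavour formula. The unit conversion factor 1.267 (for Δm² in eV², L in km and E in GeV) and all function names are assumptions of the sketch.

```python
import cmath
import math

PHASE_FACTOR = 1.267  # dm2 [eV^2] * L [km] / E [GeV] -> dm2 L / 4E


def oscillation_probability(u, m2, alpha, beta, l_over_e):
    """P(nu_alpha -> nu_beta) as a double sum over mass eigenstates.

    u        : N x N mixing matrix, u[flavour][mass state]
    m2       : list of squared masses in eV^2
    l_over_e : L/E in km/GeV
    """
    n = len(m2)
    total = 0j
    for k in range(n):
        for j in range(n):
            # Phase dm2_kj L / 2E is twice the conventional dm2 L / 4E.
            phase = 2.0 * PHASE_FACTOR * (m2[k] - m2[j]) * l_over_e
            total += (complex(u[alpha][k]) * complex(u[beta][k]).conjugate()
                      * complex(u[alpha][j]).conjugate() * complex(u[beta][j])
                      * cmath.exp(-1j * phase))
    return total.real


def two_flavour_matrix(theta):
    """Real 2 x 2 rotation used as a test case."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, s], [-s, c]]


if __name__ == "__main__":
    u = two_flavour_matrix(0.6)
    m2 = [0.0, 2.3e-3]
    for x in (100.0, 300.0, 500.0):
        p = oscillation_probability(u, m2, 0, 0, x)
        print(f"L/E = {x:5.0f} km/GeV -> P(survival) = {p:.3f}")
```

For two flavours the result reduces to 1 − sin² 2θ sin²(1.267 Δm² L/E), and the survival and appearance probabilities add up to one, as unitarity requires.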
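The neutrino oscillation probability, eqn. 11.37, can also be written as

$$P(\nu_\alpha \to \nu_\beta) = \delta_{\alpha\beta} - 4 \sum_{k>j} \mathrm{Re}(U_{\alpha k} U^*_{\beta k} U^*_{\alpha j} U_{\beta j}) \sin^2\!\left( \frac{\Delta m_{kj}^2 L}{4E} \right) + 2 \sum_{k>j} \mathrm{Im}(U_{\alpha k} U^*_{\beta k} U^*_{\alpha j} U_{\beta j}) \sin\!\left( \frac{\Delta m_{kj}^2 L}{2E} \right) \tag{11.38}$$

For antineutrinos, the elements of the PMNS matrix $U_{\alpha k}$ must be
replaced by their complex conjugates $U^*_{\alpha k}$ and

$$|\bar\nu_\alpha\rangle = \sum_k U^*_{\alpha k} |\bar\nu_k\rangle$$

The antineutrino oscillation probabilities become

$$P(\bar\nu_\alpha \to \bar\nu_\beta) = \delta_{\alpha\beta} - 4 \sum_{k>j} \mathrm{Re}(U_{\alpha k} U^*_{\beta k} U^*_{\alpha j} U_{\beta j}) \sin^2\!\left( \frac{\Delta m_{kj}^2 L}{4E} \right) - 2 \sum_{k>j} \mathrm{Im}(U_{\alpha k} U^*_{\beta k} U^*_{\alpha j} U_{\beta j}) \sin\!\left( \frac{\Delta m_{kj}^2 L}{2E} \right) \tag{11.39}$$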


The third terms of eqns 11.38 and 11.39 have different signs, with 
the important consequence that for the same L/E the neutrino and 
antineutrino oscillation probabilities are different. Since neutrinos and 
antineutrinos are CP conjugates, this immediately suggests the possi- 
bility of CP violation and, by the CPT theorem, T violation. The CP 
asymmetry would be 


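$$A^{CP}_{\alpha\beta}(L, E) = P(\nu_\alpha \to \nu_\beta) - P(\bar\nu_\alpha \to \bar\nu_\beta) = 4 \sum_{k>j} \mathrm{Im}(U_{\alpha k} U^*_{\beta k} U^*_{\alpha j} U_{\beta j}) \sin\!\left( \frac{\Delta m_{kj}^2 L}{2E} \right) \tag{11.40}$$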


CP violation would be manifested as a difference between the $\nu_\alpha \to \nu_\beta$
and $\bar\nu_\alpha \to \bar\nu_\beta$ oscillation probabilities, and T violation by a difference in
the time-reversed probabilities $P(\nu_\alpha \to \nu_\beta)$ and $P(\nu_\beta \to \nu_\alpha)$. For CP or
T violation to occur, the imaginary parts of eqn 11.40 must not vanish;
in other words, the PMNS matrix elements $U_{\alpha k}$ must be complex.²⁵ CP
violation would not occur for only two neutrino flavours since, just as
in the case of two quark generations, the mixing can be described by a
single angle.

The transition $\bar\nu_\beta \to \bar\nu_\alpha$ is the CPT conjugate of $\nu_\alpha \to \nu_\beta$ and, by
the CPT theorem, $P(\bar\nu_\beta \to \bar\nu_\alpha) = P(\nu_\alpha \to \nu_\beta)$. Therefore, the
CP-conjugate survival probabilities $P(\bar\nu_\alpha \to \bar\nu_\alpha)$ and $P(\nu_\alpha \to \nu_\alpha)$ must be
equal, and CP violation can only be observed by comparing appearance
(i.e. flavour-changing) probabilities.
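
These statements can be verified numerically. The sketch below builds a three-flavour mixing matrix in the standard parametrization (the one given in this chapter as eqn 11.42), replaces U by its complex conjugate for antineutrinos, and compares the two probabilities. The mixing angles, mass splittings, the value of L/E and the unit factor 1.267 are illustrative assumptions, not fitted values from the text.

```python
import cmath
import math

PHASE_FACTOR = 1.267  # dm2 [eV^2] * L [km] / E [GeV] -> dm2 L / 4E


def pmns(t12, t23, t13, delta):
    """Three-flavour mixing matrix in the standard parametrization."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    ep, em = cmath.exp(1j * delta), cmath.exp(-1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep,
         c12 * c23 - s12 * s23 * s13 * ep, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep,
         -c12 * s23 - s12 * c23 * s13 * ep, c23 * c13],
    ]


def probability(u, m2, alpha, beta, l_over_e, antineutrino=False):
    """|sum_k U_ak U*_bk exp(-i m_k^2 L / 2E)|^2, with U -> U* for antineutrinos."""
    amplitude = 0j
    for k in range(3):
        u_ak, u_bk = complex(u[alpha][k]), complex(u[beta][k])
        if antineutrino:
            u_ak, u_bk = u_ak.conjugate(), u_bk.conjugate()
        phase = 2.0 * PHASE_FACTOR * m2[k] * l_over_e
        amplitude += u_ak * u_bk.conjugate() * cmath.exp(-1j * phase)
    return abs(amplitude) ** 2


def cp_asymmetry(u, m2, alpha, beta, l_over_e):
    """P(nu_alpha -> nu_beta) - P(nubar_alpha -> nubar_beta)."""
    return (probability(u, m2, alpha, beta, l_over_e)
            - probability(u, m2, alpha, beta, l_over_e, antineutrino=True))


if __name__ == "__main__":
    m2 = [0.0, 7.5e-5, 2.375e-3]  # eV^2, illustrative
    for delta in (0.0, math.pi / 2):
        u = pmns(0.59, math.pi / 4, 0.15, delta)
        a = cp_asymmetry(u, m2, 1, 0, 500.0)  # nu_mu -> nu_e
        print(f"delta = {delta:.2f} -> A_CP(mu -> e) = {a:+.4f}")
```

With δ = 0 the asymmetry vanishes, with δ = π/2 the appearance probabilities differ, and the survival probabilities of neutrinos and antineutrinos agree for any δ.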

The rather formidable expressions for the oscillation probabilities in 
eqns 11.38 and 11.39 simplify in certain cases. For example, when one of 


25 The quantity $\mathrm{Im}(U_{\alpha k} U^*_{\beta k} U^*_{\alpha j} U_{\beta j})$ is directly analogous to the
invariant area $J$ of the CKM unitarity triangles, eqn 10.28 in
Section 10.6, which describes CP violation in the quark sector.
26 This representation of the PMNS matrix is identical with the
representation of the CKM matrix of quark mixing discussed in Section 10.6.
Of course the angles have entirely different physical meanings.


the $\Delta m_{kj}^2$ is much greater than all other $\Delta m^2$s (‘one-mass-scale
dominance’), they reduce to quasi-two-flavour expressions described by a
single effective mixing angle.


11.5.2 Three-flavour oscillations 


There is currently no convincing evidence for more than three neutrino
flavours. The mixing of the known neutrinos can therefore be described
in terms of three mixing angles $\theta_{12}$, $\theta_{13}$, and $\theta_{23}$, a single phase $\delta$, and the
three mass (squared) differences, $\Delta m_{21}^2$, $\Delta m_{32}^2$, and $\Delta m_{31}^2$ between three
mass eigenstates $\nu_1$, $\nu_2$, and $\nu_3$. It is known that
$|\Delta m_{31}^2| \approx |\Delta m_{32}^2| \gg \Delta m_{21}^2$
since solar and atmospheric neutrino oscillations are described by two
very different mass differences. From the solar neutrino data allowing for
the MSW effect (see Section 11.4.4), we can determine that $\Delta m_{21}^2 > 0$.
However, we do not currently know the hierarchy of the masses $m_2$ and
$m_3$; i.e. is $\Delta m_{32}^2 < 0$?
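The three-flavour PMNS matrix (eqn 11.1)

$$U = \begin{pmatrix} U_{e1} & U_{e2} & U_{e3} \\ U_{\mu 1} & U_{\mu 2} & U_{\mu 3} \\ U_{\tau 1} & U_{\tau 2} & U_{\tau 3} \end{pmatrix} \tag{11.41}$$
can be expressed in terms of the mixing angles as

$$U = \begin{pmatrix}
c_{12} c_{13} & c_{13} s_{12} & s_{13} e^{-i\delta} \\
-c_{23} s_{12} - c_{12} s_{13} s_{23} e^{i\delta} & c_{12} c_{23} - s_{12} s_{13} s_{23} e^{i\delta} & c_{13} s_{23} \\
s_{23} s_{12} - c_{12} c_{23} s_{13} e^{i\delta} & -c_{12} s_{23} - c_{23} s_{12} s_{13} e^{i\delta} & c_{13} c_{23}
\end{pmatrix} \tag{11.42}$$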
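where $c_{ij} = \cos\theta_{ij}$ and $s_{ij} = \sin\theta_{ij}$. The mixing angles $\theta_{ij}$ can be
thought of as representing rotations around a third, $k$, axis. For example,
$\theta_{13}$ represents a rotation around the ‘2’ axis.²⁶ The matrix can also be
written as follows to make the three separate rotations more apparent:

$$U = \begin{pmatrix} 1 & 0 & 0 \\ 0 & c_{23} & s_{23} \\ 0 & -s_{23} & c_{23} \end{pmatrix}
\begin{pmatrix} c_{13} & 0 & s_{13} e^{-i\delta} \\ 0 & 1 & 0 \\ -s_{13} e^{i\delta} & 0 & c_{13} \end{pmatrix}
\begin{pmatrix} c_{12} & s_{12} & 0 \\ -s_{12} & c_{12} & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{11.43}$$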

It can be seen from eqn 11.42 that the phase $\delta$ must be non-zero for CP
violation to occur. It can also be shown that all the angles, including
$\theta_{13}$, must be non-zero for CP violation to occur.
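
Both statements can be checked with a short numerical sketch. It multiplies the three rotations of the factorized form, verifies that the product is unitary, and evaluates one of the imaginary parts that controls CP violation. The angle values and function names are illustrative assumptions. The magnitude of the imaginary part is compared with the closed form c₁₂ s₁₂ c₂₃ s₂₃ c₁₃² s₁₃ sin δ, which is the standard expression for this parametrization and is not quoted in the text.

```python
import cmath
import math


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def pmns_from_rotations(t12, t23, t13, delta):
    """Product R23 * R13(delta) * R12 of the three rotations."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    r23 = [[1, 0, 0], [0, c23, s23], [0, -s23, c23]]
    r13 = [[c13, 0, s13 * cmath.exp(-1j * delta)],
           [0, 1, 0],
           [-s13 * cmath.exp(1j * delta), 0, c13]]
    r12 = [[c12, s12, 0], [-s12, c12, 0], [0, 0, 1]]
    return matmul(matmul(r23, r13), r12)


def unitarity_defect(u):
    """Largest |(U U^dagger - 1)_ij|; zero for an exactly unitary matrix."""
    n = len(u)
    worst = 0.0
    for i in range(n):
        for j in range(n):
            entry = sum(complex(u[i][k]) * complex(u[j][k]).conjugate()
                        for k in range(n))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst


def cp_invariant(u):
    """Im(U_e2 U*_mu2 U*_e1 U_mu1), i.e. alpha = e, beta = mu, k = 2, j = 1."""
    q = (complex(u[0][1]) * complex(u[1][1]).conjugate()
         * complex(u[0][0]).conjugate() * complex(u[1][0]))
    return q.imag


if __name__ == "__main__":
    u = pmns_from_rotations(0.59, 0.785, 0.15, 1.0)
    print(f"unitarity defect = {unitarity_defect(u):.1e}")
    print(f"CP invariant     = {cp_invariant(u):+.4f}")
```

Setting either δ = 0 or θ₁₃ = 0 in this sketch makes the invariant vanish, in line with the two conditions stated above.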

The three-flavour oscillation probabilities for neutrinos and
antineutrinos can be written in terms of the mixing angles by substituting the
$U_{\alpha i}$ of eqn 11.42 into eqn 11.38 or 11.39, respectively, for $N = 3$. Since
the resulting expressions are rather long, they will not be given here. An
important result, however, is that the dependence of the oscillation
probabilities on $L/E$ falls into three different regimes. If
$\Delta m_{32}^2 L/E \ll 1$, no flavour oscillations will occur. In the second
regime, where $\Delta m_{32}^2 L/E \sim 1$ but $\Delta m_{21}^2 L/E \ll 1$, the oscillation
probability $P(\nu_\alpha \to \nu_\beta)$ depends mainly on the four elements of $U$ that
couple $\nu_\alpha$ and $\nu_\beta$ to $\nu_2$ and $\nu_3$, i.e. $U_{\alpha 2}$, $U_{\beta 2}$, $U_{\alpha 3}$, and
$U_{\beta 3}$. Finally, if $\Delta m_{21}^2 L/E \gtrsim 1$, the oscillation probabilities
depend on all the $U_{\alpha i}$. For example, if $U_{e3} = 0$ (as is approximately
true), $\nu_e$ does not couple to $\nu_3$ and $\nu_e \leftrightarrow \nu_\mu$ and $\nu_e \leftrightarrow \nu_\tau$
oscillations will not occur unless $\Delta m_{21}^2 L/E \gtrsim 1$, although
$\nu_\mu \leftrightarrow \nu_\tau$ oscillations can take place at smaller $L/E$ as long as
$\Delta m_{32}^2 L/E \sim 1$. The two regimes
$\Delta m_{21}^2 L/E \ll 1 \lesssim \Delta m_{32}^2 L/E$ and $1 \lesssim \Delta m_{21}^2 L/E$ are
often referred to as the ‘atmospheric’ and ‘solar’ oscillation regimes,
respectively.

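When the results of all neutrino oscillation experiments and
measurements are combined, it is found that the mixing is described by

$$\begin{pmatrix} \nu_e \\ \nu_\mu \\ \nu_\tau \end{pmatrix} =
\begin{pmatrix} 0.82 & 0.56 & {\sim}0.15 \\ 0.31\text{--}0.43 & 0.51\text{--}0.59 & 0.75 \\ 0.37\text{--}0.47 & 0.59\text{--}0.66 & 0.66 \end{pmatrix}
\begin{pmatrix} \nu_1 \\ \nu_2 \\ \nu_3 \end{pmatrix} \tag{11.44}$$

and $\Delta m_{32}^2 \approx \Delta m_{31}^2 = 2.3 \times 10^{-3}\ \mathrm{eV}^2$ and $\Delta m_{21}^2 = 7.5 \times 10^{-5}\ \mathrm{eV}^2$.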

At present there is no experimental evidence that the phase $\delta$ is
non-zero, and the values in eqn 11.44 are the magnitudes of $U_{\alpha i}$. It can
be seen that since $|U_{e3}| = \sin\theta_{13} \approx 0.15$, $\nu_e$ is coupled only weakly
to $\nu_3$; the $\nu_3$ ‘content’ of $\nu_e$ is $|U_{e3}|^2 \approx 0.02$, i.e. only 2%. From the
discussion in the preceding paragraph, this means that $\nu_e$ barely
participates in oscillations involving $\nu_2$ and $\nu_3$ only, i.e. where
$\Delta m_{32}^2 L/E \sim 1$ but $\Delta m_{21}^2 L/E \ll 1$, and justifies the analysis of the
results of atmospheric neutrino and long-baseline accelerator experiments
in terms of two-flavour oscillations characterized by $\Delta m_{32}^2$ and $\theta_{23}$. It
also means that the disappearance of solar $\nu_e$ can be described in terms of
two flavours characterized by $\Delta m_{21}^2$ and $\theta_{12}$.²⁷ The oscillation regimes
are illustrated in Fig. 11.9.

In stark contrast to the CKM matrix, which is approximately diagonal,
all the elements of $U$, with the exception of $U_{e3}$ (equivalently $\theta_{13}$),
are large. Their approximate equality suggests that CP violation by
neutrinos could be large for the same reason that CP violation by mesons
depends on the size of the CKM matrix elements via the angles of the
unitarity triangles as discussed in Section 10.6.


27 This can relatively easily be seen by setting $s_{13} = 0$ and $c_{13} = 1$ in
eqn 11.42 and calculating the $\nu_e$ survival probability.


[Figure: regimes in order of increasing L/E: no oscillations, ‘atmospheric’, ‘solar’.]

Fig. 11.9 The behaviour of flavour-changing neutrino oscillations as a
function of $L/E$. In the atmospheric regime, the transitions are
predominantly $\nu_\mu \leftrightarrow \nu_\tau$; in the solar regime, all flavours participate.
Fig. 11.10 The energy spectrum of detected electron neutrinos from the
T2K experiment with the Super-Kamiokande detector. The data are
significantly above the estimated background and are in good agreement
with the expectations from neutrino oscillations [8]. The solid curve is
the result of a fit to the oscillation model.
Very convincing evidence for a non-zero value of $\theta_{13}$ was shown by the
Daya Bay experiment. This used an array of six identical detectors to
measure the $\bar\nu_e$ flux around six nuclear reactors. The detectors used
liquid scintillator viewed by photomultipliers. The reaction is the same as
that used by KamLAND (see Section 11.4.5) and is also a disappearance
experiment, but over baselines of ~1 km, rather than the ~100 km
studied by KamLAND, which is why there is sensitivity to the mixing angle
$\theta_{13}$. There was a separate layer of purified water outside the scintillator
to act as a veto detector (for example to reject incoming cosmic rays).
The detectors nearer the reactors measured the flux, and this was used
to predict the flux at the further away detectors. The ratio of
measured to predicted flux at the further away detectors was significantly
less than 1, which is evidence for a non-zero value of $\theta_{13}$. This result was
confirmed by RENO (another reactor experiment). Further confirmation
of the non-zero value of $\theta_{13}$ was provided by the T2K experiment. This
used a very intense 30 GeV proton beam at the J-PARC accelerator to
send a neutrino beam to the Super-Kamiokande detector at a distance
of 295 km. In order to obtain the maximal oscillation probability for this
baseline and the expected value of $\Delta m_{32}^2$, the optimal neutrino energy
was $E_\nu \sim 1$ GeV. This was achieved by having the beam line at an angle
of 2.5° with respect to the line from the accelerator to the detector. Clear
evidence for $\nu_e$ appearance in a $\nu_\mu$ beam is shown by the spectrum of
identified $\nu_e$ interactions (see Fig. 11.10) [8].
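
The baselines and energies quoted for these experiments can be related to the mass splitting with a simple estimate. The sketch below assumes the practical phase formula 1.267 Δm² [eV²] L [km] / E [GeV], the atmospheric mass splitting of 2.3 × 10⁻³ eV² quoted in this chapter, and a typical reactor antineutrino energy of 4 MeV (an assumption, not a number from the text). It locates the first oscillation maximum, where the phase equals π/2.

```python
import math

PHASE_FACTOR = 1.267  # dm2 [eV^2] * L [km] / E [GeV] -> dm2 L / 4E
DM2_ATM = 2.3e-3      # eV^2, atmospheric mass splitting quoted in the text


def first_maximum_energy_gev(dm2_ev2: float, l_km: float) -> float:
    """Energy at which the phase dm2 L / 4E reaches pi/2 for a given baseline."""
    return 2.0 * PHASE_FACTOR * dm2_ev2 * l_km / math.pi


def first_maximum_baseline_km(dm2_ev2: float, e_gev: float) -> float:
    """Baseline at which the phase dm2 L / 4E reaches pi/2 for a given energy."""
    return math.pi * e_gev / (2.0 * PHASE_FACTOR * dm2_ev2)


if __name__ == "__main__":
    e_t2k = first_maximum_energy_gev(DM2_ATM, 295.0)
    l_reactor = first_maximum_baseline_km(DM2_ATM, 0.004)
    print(f"T2K (295 km): first maximum at E = {e_t2k:.2f} GeV")
    print(f"Reactor (4 MeV): first maximum at L = {l_reactor:.1f} km")
```

This estimate gives an energy of roughly half a GeV for the T2K baseline and a baseline of about 2 km for reactor antineutrinos, which agrees in order of magnitude with the ~1 GeV and ~1 km scales described above.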

A global fit to these and other data gave a value of $\sin^2 2\theta_{13} =
0.096 \pm 0.013$ [115]. A necessary but not sufficient condition for CP
violation in the neutrino sector is that $\theta_{13}$ is non-zero. Therefore, the
observation of this large value for $\theta_{13}$ opens the way to the study of
CP violation in the neutrino sector, which would be very interesting for 
the reasons discussed in Section 11.5.4. This would require measuring 
different oscillation rates for neutrinos and antineutrinos. Several ideas 
for new very long-baseline experiments are being considered. They are
very challenging because potential small differences between neutrino 
and antineutrino oscillation probabilities need to be measured, which 
will require very intense neutrino beams as well as very large detectors. 


11.5.4 Matter—antimatter asymmetry 


The Universe appears to be dominated by matter compared with
antimatter. If there were stars or galaxies made of antimatter, we would
expect characteristic $\gamma$ rays from matter–antimatter annihilation. There
is no evidence for this from astrophysics. The primary cosmic-ray flux
contains only a small component of antiparticles such as positrons, and
these can be explained by secondary production processes. As well as
explaining this effect, we also need to understand the baryon-to-photon
($n_B/n_\gamma$) ratio. In the early Universe, with temperatures satisfying
$k_B T \gg 2m_p$, we can have the reactions $p\bar{p} \leftrightarrow \gamma\gamma$, which would have
been in thermodynamic equilibrium, and the value of $n_B/n_\gamma$ would be
determined by the Boltzmann and statistical factors. However, as the
Universe cooled, the interaction rate became lower than the inverse of
the expansion time, so the baryons would have been ‘frozen out’. The
Standard Model prediction²⁸ for $n_B/n_\gamma \sim 10^{-19}$ is much lower than the
measured value $n_B/n_\gamma \sim 10^{-10}$.

28 Using the CKM mechanism.

If the Universe started off with matter–antimatter symmetry, then we
must satisfy the three Sakharov conditions [124] in order to have the
currently observed matter-dominated Universe:²⁹

(1) There must be baryon number violation.

(2) There must be C-symmetry and CP-symmetry violation.

(3) Interactions must occur at an epoch in which the Universe was not
in thermodynamic equilibrium.

29 The explanation of the matter–antimatter asymmetry can also resolve
the problem with the ratio $n_B/n_\gamma$.


The first condition is obvious, but the second is not so self-evident. If 
C symmetry held, there would be an equal rate for the production of 
baryons and antibaryons. Similarly, if CP symmetry held, there would 
be an equal rate for the production of ‘left-handed’ baryons and ‘right- 
handed’ antibaryons and an equal rate for the production of ‘left-handed’ 
antibaryons and ‘right-handed’ baryons. Therefore, the second condition
is required to produce more baryons than antibaryons. If the Universe 
were in thermodynamic equilibrium, CPT symmetry would ensure that 
there was an equal rate for reactions creating baryons and antibaryons. 

The SM does have CP violation in the quark sector, but it turns out 
that the off-diagonal CKM elements are so small that this cannot pro- 
vide sufficient CP violation to explain the observed matter—antimatter 
asymmetry and $n_B/n_\gamma$. One possible explanation would be to invoke a
Grand Unified Theory (GUT), which would naturally generate baryon- 
number-violating interactions since quarks and leptons are in the same 
multiplets. However, these GUT models tend to predict a rate of proton 
decay that is incompatible with measurements. 
30 In Chapter 8, we looked at the search for neutrinoless double $\beta$ decay,
which if observed would prove that neutrinos are Majorana particles.


31 We require the masses to be positive, so clearly the parameter $M$ must
be imaginary.


There are many different theoretical attempts to explain the observed 
asymmetry, although here we consider only one model. An attractive 
possibility is that the CP violation arises in the neutrino sector. We 
already know that the magnitudes of the off-diagonal elements of the 
neutrino mixing (PMNS) matrix are relatively large. If the one complex
phase, $\delta$, in the PMNS matrix turned out to be large, this would
predict relatively strong CP violation in the neutrino sector. Now that
we know that neutrinos have non-zero masses, they could be Dirac or
Majorana particles (see Chapter 6).³⁰ We will consider a model in which
the neutrinos are Majorana particles. One of the unexplained features 
of the SM is that the neutrino masses are so much smaller than those 
of the charged leptons and the quarks. One attempt to explain this is 
motivated by GUTs in which the neutrino mass matrix is given by 


$$M_\nu = \begin{pmatrix} 0 & M \\ M & B \end{pmatrix} \tag{11.45}$$


The value of M is of the order of the electroweak scale, but the value 
of the parameter B is of the order of the GUT scale, so B >> M. The 
masses of the neutrinos are then given by the eigenvalues of the mass 
matrix (eqn 11.45): 


$$\lambda_\pm = \tfrac{1}{2} \left( B \pm \sqrt{B^2 + 4M^2} \right) \tag{11.46}$$


In the regime $B \gg M$, this gives $\lambda_+ \approx B$ and $\lambda_- \approx -M^2/B$.³¹
Therefore, larger values of $B$ increase (decrease) the mass of the heavier
(lighter) eigenstate; hence this is called the ‘see-saw’ mechanism.
The lighter mass corresponds to the observed left-handed (LH) neu- 
trino states and the heavier mass state would be a right-handed (RH) 
neutrino. 
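
The size of the effect can be illustrated with a short numerical sketch. The scales used below (M = 100 GeV, B = 10¹⁵ GeV) are assumed purely for illustration and are not taken from the text; the function name is likewise hypothetical. The lighter eigenvalue is obtained from the product of the eigenvalues, λ₊λ₋ = −M², rather than from the difference in eqn 11.46, because the subtraction loses all precision in floating-point arithmetic when B ≫ M.

```python
import math

def seesaw_masses(M, B):
    """Eigenvalues of the matrix [[0, M], [M, B]] (eqn 11.45).

    Returns (lambda_plus, lambda_minus) in the same units as M and B.
    """
    root = math.sqrt(B * B + 4.0 * M * M)
    lam_plus = 0.5 * (B + root)
    # The determinant gives lam_plus * lam_minus = -M^2; using it avoids
    # the cancellation in 0.5 * (B - root) when B >> M.
    lam_minus = -M * M / lam_plus
    return lam_plus, lam_minus

# Assumed, illustrative scales in GeV: M near the electroweak scale,
# B near a typical GUT scale.
M_EW, B_GUT = 100.0, 1.0e15
heavy, light = seesaw_masses(M_EW, B_GUT)
print(f"heavy state: {heavy:.3e} GeV")
print(f"light state: {abs(light) * 1.0e9:.3e} eV (magnitude)")
```

With these assumed inputs the light state comes out many orders of magnitude below the charged-lepton masses, which is the qualitative point of the see-saw argument.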

In the early Universe with temperatures k_BT ≫ m(ν_R), these LH
and RH neutrinos would have been in thermodynamic equilibrium and 
there could have been no asymmetry. As the Universe cooled to lower 
temperatures, it would no longer be in thermodynamic equilibrium (thus 
satisfying the third Sakharov condition) and the abundance of the heavy 
RH neutrinos would have been ‘frozen-out’. They would then have decayed
to the light neutrino states in a CP-violating way to create a
lepton number asymmetry. The lepton asymmetry can be converted to
a baryon asymmetry within the SM. No perturbative SM process
can violate B + L (where B is the baryon number and L
the lepton number), but this can happen non-perturbatively. The SU(2)
vacuum has an infinite number of minima, each with a different value
of B + L. At low temperatures, the probability of transition from one
vacuum state to another is suppressed by a factor of e^(−M_W/T), where
M_W is the mass of the W boson. However, at very high temperatures
k_BT ≫ M_W, the Universe can easily hop over the barrier from one
vacuum state to another, thus allowing violation of B + L. This then
allows the lepton number asymmetry created by the decays of the ν_R
to be converted to a baryon number (and lepton number) asymmetry.
As electric and colour charge are protected by gauge symmetries, the 
Universe remains neutral for electric and colour charge. 


Chapter summary 


Very convincing evidence for neutrino oscillations has been observed 
in the study of atmospheric neutrinos and confirmed by laboratory 
experiments. 


Similarly convincing evidence for neutrino oscillations has been found 
with solar neutrinos and confirmed by a reactor antineutrino experiment. 


In a simple extension of the minimal Standard Model, neutrinos can 
have non-zero masses. According to quantum mechanics, it then becomes 
possible for one flavour of neutrino to oscillate into another flavour. 


The discovery of a non-zero mixing angle θ₁₃ opens the way for the
search for CP violation in the neutrino sector, which might explain the
matter-antimatter asymmetry in the Universe.


Further reading 


• Perkins, D. (2009). Particle Astrophysics (2nd edn).
Oxford University Press. This is a very clear and
accessible textbook on the many areas in which
astrophysical measurements can yield interesting
constraints on particle physics.


• Akhmedov, E. Kh. and Smirnov, A. Yu. (2009). Paradoxes
of neutrino oscillations. Phys. Atom. Nuclei, 72,
1363. This gives a very thorough discussion of a correct
description of neutrino oscillations.


Exercises 


(11.1) Show that we obtain identical formulae for neu- 
trino oscillation probabilities if we assume that 
the two mass eigenstates have the same energy 
but different momenta, as opposed to the assump- 
tion of the same momentum but different energy 
used to derive eqn 11.5. 


(11.2) (a) Consider the decay π → μν_i, where ν_i is a
mass eigenstate with mass m_i. Assume there
are two neutrino mass eigenstates, one with
m₁ = 0 and with a splitting in mass given by
Δm²₁₂ = 10⁻⁴ eV². Determine the momenta
and energies of the two neutrino mass eigenstates
(in the pion CMS), and hence show
that neither the momenta nor the energies
are the same, and determine the difference in
momenta of the two flavours of neutrinos.

(b) If the decay is localized in some linear region
of length l, use the Heisenberg uncertainty
principle to determine the uncertainty σ_p in
the momentum of the neutrino. For what
value of l would σ_p be equal to the difference
in momenta between the ν₁ and ν₂ states?

Comment on your results.


(11.3) The result that we get two atmospheric muon
neutrinos for each electron neutrino, in a given
neutrino energy interval, is not obvious given the
steeply changing cosmic-ray spectrum. Starting
from a 2 GeV pion, compute roughly the average
energy of the ν and μ into which it decays.
Then, again using a rough calculation, divide the
energy of the μ equally among the decay products
to show that all the neutrino energies are
roughly equal, so the naive result works. This is
a lucky coincidence; repeat this starting with a
2 GeV kaon to show that the sharing is not equal.


(11.4) SN1987A was a supernova explosion that occurred
at a distance of approximately 170 000
light years from Earth. The burst of neutrinos
preceded the observation of the supernova with
optical telescopes (this was expected because it
takes time for the shock wave to reach the surface).
The neutrino pulse lasted for about 10 s and
the neutrino energy range was about 10-20 MeV.
How could such neutrinos be produced in the
supernova and how could they be detected? Use
these data to estimate an approximate upper
limit for the neutrino mass.


(11.5) Find the eigenvalues of the Hamiltonian given
in eqn 11.23. Hence show that the difference
in energies of the two eigenstates is given by
ΔE = Δm²/2p, as expected for ultrarelativistic
neutrinos.

Assuming that neutrinos are ultrarelativistic,
derive eqn 11.23. Hint: You can drop any term in
the Hamiltonian that is proportional to the unit
matrix, since it does not affect oscillations.


(11.6) Estimate the resonant neutrino energy, assuming
that the electron density at the centre of the Sun
is N(r = 0) ≈ 3 × 10³¹ m⁻³.


(11.7) Explain the factor of 1.27 in eqn 11.12. If Δm² =
0.003 eV² and L = 700 km, determine the neutrino
energy required to maximize the effect of
oscillations.


(11.8) If we wish to make a neutrino beam from the decays
of π⁺ with a neutrino energy ~3 GeV, make
an approximate estimate of how long a decay
path is required.


(11.9) Calculate the neutrino threshold energy for the
three reactions given by eqns 11.16-11.18, assuming
that all neutrino masses are negligible. How
can the reactions be identified experimentally and
why is it necessary to require a minimum recoil
electron energy of about 5 MeV in practice?


(11.10) For elastic νe scattering at a fixed incident neutrino
energy E_ν, find a relation between E_e and θ,
the energy and angle relative to the incident ν
direction of the recoil electron. Assume that the
electron is initially at rest. Show that your relation
implies that the observed distribution of
recoil electrons is strongly peaked in the incident
ν direction. Why is this important for studies of
solar neutrinos?


The Higgs boson 


The Higgs mechanism is a part of the standard model that we post- 
poned from the discussion of electroweak unification in Chapter 7. Until 
recently, it was one possible theoretical solution to an otherwise prob- 
lematic part of the GSW theory that we have highlighted in earlier 
chapters—that we want the W and Z to be massive to make a short- 
range force, but we want them to be massless so that gauge invariance 
works and the theory is renormalizable (Section 7.4.4). The Higgs mech- 
anism is as old as the rest of the GSW theory and predicts that there 
should be a boson, the Higgs, but does not predict its mass. 

Recently, strong evidence for a boson that fits the description has been 
found in several different decay channels at both of the big experiments 
at the LHC. In this chapter, we will first discuss the theory behind the 
Higgs boson and then take a look at the experimental evidence for its 
existence. 

We start at a ‘middle difficulty’ level, i.e. try to explain all the key 
ideas without too much mathematics.¹ Then, having set the scene, we
give a more sophisticated mathematical picture of the Higgs mechanism 
in Section 12.4. 

The discussion of the Higgs mechanism starts with all the particles 
being massless. We then postulate a new field with which all particles 
can interact. The interaction of a particle with this field produces an 
effect that makes the particle appear to have mass; i.e. the field produces 
an effect when the particle moves and this manifests itself as inertia. To 
begin with, we have a free hand to choose whatever form of new field 
we like, but it must respect both Lorentz invariance and local gauge 
invariance. 

The type of field that we need has already been introduced in 
Section 7.2.6. Now there are certain properties that the new field must
possess: it must respect Lorentz covariance, it must be colourless and
uncharged, and it must have spin zero.² The field is related to the weak
interaction, to give masses to the W and Z, so we will allow it to have
both weak isospin and hypercharge (recall that these are related to the
electric-charge equivalents for the W⁺, W⁰, W⁻ and for the B⁰,
respectively; see Section 7.4.1). Finally, it must be locally gauge-invariant,
because we want it to make the whole theory gauge-invariant. 


Particle Physics in the LHC Era, Giles Barr, Robin Devenish, Roman Walczak,
& Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak,
& Tony Weidberg 2016. Published in 2016 by Oxford University Press.

1 Specifically, as we made clear in earlier
chapters, we are not using the
language of relativistic quantum field
theory (RQF). How this affects our
treatment of the Higgs mechanism will
be covered briefly in Section 12.4.


2 We must be relativistically consistent
and also not allow the field to interact 
via any of the known forces. Otherwise, 
when our massless particle (e.g. an 
electron) interacts with it, thus gain- 
ing mass, it could also exchange spin 
or charge with the field. This would 
give the impression that the electron 
spin or charge could change spontan- 
eously, which would not agree with 
experiment.


Fig. 12.1 Potential energy V(E) associated with a field such as the
electric field E in a capacitor.


12.1 Local gauge invariance 


We give a brief reminder of what gauge invariance entails. It is an im- 
portant property that fields can, but do not have to, possess. Those that 
are locally gauge-invariant are, by ’t Hooft’s theorem, renormalizable, 
which is important for the reasons described in Section 7.4.4. We have 
already considered gauge symmetry in the familiar context of classical 
electric and magnetic fields (see Chapter 6). We showed that the classical 
observable fields B and E are unchanged by the simultaneous changes 
to A and φ as follows:


$$\mathbf{A} \to \mathbf{A}' = \mathbf{A} + \nabla\Lambda, \qquad \phi \to \phi' = \phi - \frac{\partial\Lambda}{\partial t} \qquad (12.1)$$


where Λ is a scalar. Λ can be a function of x and t and, when it is,
we call it a local gauge transformation. In the context of relativistic
quantum mechanics, we showed that the Dirac equation for a particle
of charge q in an electromagnetic field (A, φ) is gauge-invariant if the
Dirac wavefunction ψ is also transformed:


$$\psi \to \psi' = \psi\, e^{iq\Lambda} \qquad (12.2)$$


The simultaneous gauge transformations of the A, φ, and ψ fields
leave the physics described by the Dirac equation


$$\left(i\frac{\partial}{\partial t} - q\phi\right)\psi = \left[\boldsymbol{\alpha}\cdot\left(-i\nabla - q\mathbf{A}\right) + \beta m\right]\psi \qquad (12.3)$$


unchanged (see Exercise 12.1). From demanding that
Λ can be different for different space-time coordinates (local gauge
symmetry), we can determine the form of the electromagnetic interaction.
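
The cancellation behind this statement can be sketched in two lines. The block below assumes natural units (ħ = c = 1), the standard minimally coupled form of the Dirac equation, (i∂/∂t − qφ)ψ = [α·(−i∇ − qA) + βm]ψ, and the phase convention ψ′ = e^{iqΛ}ψ; other sign conventions change the intermediate signs but not the conclusion.

```latex
% Assumed conventions: natural units, \psi' = e^{iq\Lambda}\psi,
% \mathbf{A}' = \mathbf{A} + \nabla\Lambda, \quad \phi' = \phi - \partial\Lambda/\partial t.
\begin{align}
\left(i\frac{\partial}{\partial t} - q\phi'\right)\psi'
  &= e^{iq\Lambda}\left(i\frac{\partial}{\partial t}
     - q\frac{\partial\Lambda}{\partial t} - q\phi
     + q\frac{\partial\Lambda}{\partial t}\right)\psi
   = e^{iq\Lambda}\left(i\frac{\partial}{\partial t} - q\phi\right)\psi, \\
\left(-i\nabla - q\mathbf{A}'\right)\psi'
  &= e^{iq\Lambda}\left(-i\nabla + q\nabla\Lambda
     - q\mathbf{A} - q\nabla\Lambda\right)\psi
   = e^{iq\Lambda}\left(-i\nabla - q\mathbf{A}\right)\psi .
\end{align}
```

In each line the derivative of the phase factor produces a term in ∂Λ/∂t or ∇Λ that is cancelled by the shift in the potential, so the transformed equation is the original one multiplied by the overall phase e^{iqΛ}.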


12.2 Spontaneous symmetry breaking 


We want to arrange for there to be a Higgs field so that a massless 
particle can feel an interaction with it and act as if it has a mass. All 
fields we have encountered so far have been ‘in the vacuum’, i.e. they 
are normally switched off (apart from quantum fluctuations) if nothing 
is happening, so the ground state is zero. Our new field is different. It 
must be on all the time to generate particle masses. 

The simplest type of field is a scalar field φ, which is a single real
number that can vary as a function of x and t. To make it gauge-invariant,
we will have to make it a complex number φ = φ₁ + iφ₂.
We need to understand what is involved with making the field be on all
the time. We start with a field that is real, so we set φ₂ = 0.

Familiar classical fields have energy associated with them when they 
are on; for example, there is energy associated with a charged capacitor 
(which is stored in its electric field) that is not present when the capacitor 
is discharged. This is represented in Fig. 12.1, which shows the potential 
energy associated with a field as a function of the value of the field.
It has a minimum, the value of the field when there is nothing there,
which is usually taken to set the zero of the potential. Examples of how
the potential energy V of a field with this property could vary with the
value of φ are V ∝ φ², V ∝ φ⁴, and V ∝ cosh φ − 1. There are many
others.

Thinking pictorially, we can imagine a ball rolling around in the bottom
of the potential well shown in Fig. 12.1. If it has no energy, it will
sit in the bottom at φ = 0. If it has some energy, it can oscillate by
rolling around in the bottom of the well.

Most of the time when we are using quantum mechanics, we calculate 
using Feynman diagrams or by applying perturbation theory, which is an 
expansion in terms of a ‘small’ parameter, for QED the fine-structure 
constant a. The order of perturbation theory (or Feynman diagrams) 
required will depend on the accuracy required. This is analogous to using 
a Taylor series expansion to calculate a mathematical function; however, 
going beyond the first order in RQF raises some difficult mathematical 
issues. To make perturbation theory work, we need a region in which φ
is slowly varying. For the field theory case in which we are interested,
this corresponds to expanding about the minimum of the potential. For
example, for a potential with a minimum at φ = 0 as shown in Fig. 12.2,
we can consider expansions about φ = 0.

Fig. 12.2 Potential energy function with a minimum at φ = 0.

Now consider an example of a field that does not have its potential
energy minimum at φ = 0, as shown in Fig. 12.3. The potential in this
example varies as


$$V(\phi) = \tfrac{1}{4}\lambda^2\phi^4 - \tfrac{1}{2}\mu^2\phi^2 \qquad (12.4)$$


where λ and μ are both real constants. This has two minima, which are
at φ = ±μ/λ. The vacuum state of the field is now not the one with
‘no-field’ (φ = 0) but one with either φ = +μ/λ or φ = −μ/λ. This means that
we need to consider quanta of the field as excitations with respect to 
this non-zero field value, and the vacuum acquires a non-zero ‘vacuum 
expectation value’ (often shortened to ‘vev’ in RQF texts). This is the 
key idea of spontaneous symmetry breaking. The fundamental theory 
respects a symmetry but the symmetry is ‘hidden’ (or broken) in the 
vacuum state. If such a field were to exist in nature, we would have 
to do our Feynman diagram expansions about one of the minima, a 
procedure that we will explore in this chapter. These ideas are worked 
through for this potential in Exercise 12.6. In terms of a classical analogy 
and thinking pictorially with a ball in a potential well, the ball can now
rattle around in the bottom of either minimum.

Fig. 12.3 Potential energy function V(φ) for a field with a minimum at
non-zero values of φ.

However, as noted above, the real scalar field considered so far does not 
work for the Higgs mechanism since it is not possible to make it locally 
gauge-invariant. So we now consider a complex scalar field φ = φ₁ + iφ₂
in which φ₂ does not have to be zero. We will use the potential


$$V = \tfrac{1}{4}\lambda^2(\phi^*\phi)^2 - \tfrac{1}{2}\mu^2(\phi^*\phi) \qquad (12.5)$$


Fig. 12.4 ‘Mexican hat’ potential for
the Higgs field. The vertical axis
represents the potential and the
horizontal axes represent the real and
imaginary parts of the complex Higgs
field. From Millard, Rupert. Higgs
Mexican hat potential. http://
commons. wikimedia.org/wi
Mexican_hat_potential_polar.svg.


Fig. 12.5 Spontaneous symmetry 
breaking in an isotropic ferromagnet: 
(a) symmetry above the Curie 
temperature; (b) hidden symmetry 
below the Curie temperature. 


3 This discussion is greatly simplified.
In a real ferromagnet below the Curie
temperature, the spins are aligned
within small domains but the orientation
of the spins in different domains is
random.

which is a logical extension of the potential we have used up to now
when we change from real φ to complex φ. If we substitute φ = φ₁ + iφ₂
into this potential, we get


$$V = \tfrac{1}{4}\lambda^2(\phi_1^2 + \phi_2^2)^2 - \tfrac{1}{2}\mu^2(\phi_1^2 + \phi_2^2) \qquad (12.6)$$


This function is shown in Fig. 12.4. The potential resembles a
Mexican hat; the minimum has become a ring at φ₁² + φ₂² = (μ/λ)²
and the phase of the complex number is arbitrary. If we picture a ball
rolling around in this potential, we can see two degrees of freedom. One
of these has the ball rolling up and down the sides in the same way as for
the ball in Fig. 12.3; i.e. φ₁² + φ₂² oscillates and the phase of the complex
number stays fixed. The other way, which is new, is for the ball to roll
around the bottom; i.e. the value of φ₁² + φ₂² stays fixed, but the phase
of the complex number changes. This second degree of freedom requires
no energy for it to occur. In quantum field theory, it corresponds to the
existence of a boson called a ‘Goldstone boson’, which has no mass. No
such boson exists experimentally, but we will address this shortly.
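
The two degrees of freedom can be made concrete with a small numerical sketch. It assumes the potential V = ¼λ²(φ₁² + φ₂²)² − ½μ²(φ₁² + φ₂²) with arbitrary illustrative values of λ and μ, and estimates the curvature of V at the point (μ/λ, 0) on the ring by central finite differences; the function names and step size are choices made here, not part of the text.

```python
import math

def V(phi1, phi2, lam, mu):
    """Mexican hat potential for a complex field phi = phi1 + i*phi2."""
    r2 = phi1**2 + phi2**2
    return 0.25 * lam**2 * r2**2 - 0.5 * mu**2 * r2

def curvatures_at_minimum(lam, mu, h=1e-4):
    """Finite-difference second derivatives of V at (mu/lam, 0).

    Returns (radial, tangential): the curvature up the side of the hat
    and the curvature along the ring of minima.
    """
    v = mu / lam
    V0 = V(v, 0.0, lam, mu)
    radial = (V(v + h, 0.0, lam, mu) - 2.0 * V0 + V(v - h, 0.0, lam, mu)) / h**2
    tangential = (V(v, h, lam, mu) - 2.0 * V0 + V(v, -h, lam, mu)) / h**2
    return radial, tangential

lam, mu = 0.7, 1.3  # arbitrary illustrative values
radial, tangential = curvatures_at_minimum(lam, mu)
print("radial curvature    :", radial, " (compare 2*mu^2 =", 2.0 * mu**2, ")")
print("tangential curvature:", tangential)
```

The radial curvature should come out close to 2μ², while the tangential curvature is zero up to numerical error: moving around the ring costs no energy, which is the classical picture behind the massless Goldstone mode.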

The concept we have been exploring here is called spontaneous sym- 
metry breaking. The potential is symmetric (i.e. the minimum can be 
anywhere around the rim of the Mexican hat), but in nature this sym- 
metry is hidden, i.e. the field has to pick one particular value. Another 
example of spontaneous symmetry breaking in nature comes from the 
spin alignment behaviour in an isotropic ferromagnet. In Fig. 12.5(a), 
above the Curie temperature, the spins are randomly aligned. The situ- 
ation is isotropic, i.e. rotationally symmetric. In Fig. 12.5(b), below 
the Curie temperature, the spins align (an electrostatic effect due to 
quantum-mechanical wavefunctions of the electrons).³ The symmetry
is now hidden. It is spontaneously broken. The symmetry still exists, be- 
cause any direction for the spin alignment could equally be chosen. This 
is analogous to the Higgs field that is inserted into the standard model. 


12.3 Higgs mechanism—the simplified 
story 


In Section 12.5, we will give a mathematical description of how the
above concepts fit together to make the Higgs mechanism. In this section, 
we give a simple picture of the ideas underlying the Higgs mechanism. We 
start with the equation for a spin-0 particle that obeys the potential we 
have just described, V = ¼λ²φ⁴ − ½μ²φ², which can behave as if it were
‘on’ all the time. This contains a degree of freedom corresponding to the 
ball rolling round the bottom of the potential, which requires no energy 
and causes the theory to predict a massless boson, the ‘Goldstone boson’. 

We then add a ‘vector’ (spin-1) field that is locally gauge-invariant. 
This field has to be massless (in order to respect the gauge invariance) 
before we consider the effect of the non-zero expectation value of the 


scalar (Higgs) field. After spontaneous symmetry breaking, the vector 
field interacts with the scalar field and acquires a mass because of the 
non-zero vacuum expectation value of the scalar field. We then extend 
this in a way that allows four fields rather than just one to be added; 
these four fields fit the properties of the W±, W⁰, and B⁰ (from the first
three steps of the electroweak unification procedure from Section 7.4.1). 
By making a few subtle rearrangements of the equation, we are able to 
organise the terms so that 


• the Goldstone boson disappears;
• the W± and the Z⁰ appear to have a mass;
• the photon remains massless;
• M_W/M_Z = cos θ_W.


So, almost miraculously, the new term in the potential makes everything 
come out exactly as we need in order to agree with experiment. 
Furthermore, we can repeat the procedure with a fermion, by adding 
it in a similar way and insisting on local gauge invariance, with the can- 
cellation of terms occurring between the Higgs field and the fermion, and 
this generates the mass term for the fermion. We can keep doing this with 
all the fermions in the Standard Model (SM) to get masses for each of 
them. The theory does not predict the mass of each fermion, but it does 
predict that the couplings g_Hff are proportional to the masses of the
fermions. The theory also does not specify the mass of the Higgs boson. 


12.4 Lagrangians 


Before moving on to applying the Higgs mechanism in more mathemat- 
ical detail, we shall make a short detour. Lagrangians are a very powerful 
alternative to Newton’s laws in classical mechanics.⁴ The Lagrangian
approach also works in quantum mechanics and it provides the math- 
ematical framework for RQF. These ideas were touched on briefly at the 
start of Chapter 6. While the Dirac equation allows only a single par- 
ticle or a system of a fixed number of particles to be considered, RQF 
allows the number of particles to change. RQF can cope with creation 
and annihilation of fermion pairs, for example, and with having several 
different fields within one equation at the same time. 


12.4.1 Lagrangians in classical mechanics 


Lagrangian mechanics follows a specific recipe, which has three steps. 
The first step is to pick the correct number of independent variables 
for the system. For example, a pendulum that moves in one dimension 
has one independent variable, θ. A pendulum that can swing in any
horizontal direction has two variables; we will choose the angle to the
vertical θ and the azimuthal angle in the horizontal plane φ. The second

4 We have deliberately avoided using
Lagrangian mechanics in this book because
of the mathematical overheads
involved. However, this approach is
particularly useful for discussing the Higgs
boson, so we give a flavour of it here
that may be helpful when consulting
more advanced texts.


5 Why it works is explained in many
textbooks on classical mechanics.

6 By definition, ℒ = T − V.


7 In this case, we can identify L_φ as
the component of angular momentum
about the z axis. In the Lagrangian
approach, we can see that conservation
laws (e.g. angular momentum) are a
result of the invariance of the Lagrangian
with respect to a conjugate variable, in
this case φ.


step is to write down an expression for the Lagrangian ℒ by working
out the total kinetic energy of the system T and the total potential
energy of the system V. This must be in terms of the independent
variables and their first derivatives with respect to time (so for the
example of the two-dimensional pendulum, these are θ, θ̇, φ, φ̇). For
the two-dimensional pendulum with a mass m attached to a string of
length l, the Lagrangian is


$$\mathcal{L} = \tfrac{1}{2}m(l\dot\theta)^2 + \tfrac{1}{2}m(l\dot\phi\sin\theta)^2 - (1 - \cos\theta)mgl \qquad (12.7)$$


where g is the acceleration due to gravity. 

The third step is to use the Euler-Lagrange equations with ℒ. There
is one Euler-Lagrange equation for each independent variable and each 
gives an equation of motion (which are usually coupled). We could 
have tried to start directly with these equations of motion, but one 
of the beauties of the Lagrangian method is to be able to manipulate 
everything in one equation before having to deal with simultaneous equa- 
tions. The Euler-Lagrange equations written for the two variables in our 
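two-dimensional pendulum problem are (see Exercise 12.2)


$$\frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot\theta}\right) = \frac{\partial\mathcal{L}}{\partial\theta}, \qquad \frac{d}{dt}\left(\frac{\partial\mathcal{L}}{\partial\dot\phi}\right) = \frac{\partial\mathcal{L}}{\partial\phi} \qquad (12.8)$$

The partial derivative with respect to θ̇ is taken holding all of the other
variables and their derivatives (θ, φ, and φ̇) constant, and similarly for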


the other partial derivatives. For the pendulum, the resulting equations 
of motion are 


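$$l\ddot\theta - l\dot\phi^2\sin\theta\cos\theta + g\sin\theta = 0, \qquad 2\dot\theta\dot\phi\sin\theta\cos\theta + \ddot\phi\sin^2\theta = 0 \qquad (12.9)$$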


A general feature of Lagrangians is that if an independent variable does
not appear explicitly in the Lagrangian, but only its time derivative (in
our example, φ does not appear explicitly), then the right-hand side of
the corresponding Euler-Lagrange equation is zero. Such a coordinate is
called cyclic and there is a conserved quantity associated with it; in our
example, this is ∂ℒ/∂φ̇ = ml² sin²θ φ̇ = L_φ. This result is immediately
apparent from the φ Euler-Lagrange equation.⁷
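
A minimal numerical sketch of this conservation law follows. It assumes illustrative values m = 1 kg, l = 1 m, g = 9.81 m/s² and an arbitrary starting state, none of which come from the text. The two Euler-Lagrange equations that follow from eqn 12.7 (including the −lφ̇² sin θ cos θ term in the θ equation) are integrated with a hand-written fourth-order Runge-Kutta step, and L_φ = ml² sin²θ φ̇ is compared at the start and end of the motion.

```python
import math

G, LENGTH, MASS = 9.81, 1.0, 1.0  # assumed illustrative values (SI units)

def derivs(state):
    """Time derivatives of (theta, theta_dot, phi, phi_dot)."""
    theta, theta_dot, phi, phi_dot = state
    s, c = math.sin(theta), math.cos(theta)
    theta_ddot = phi_dot**2 * s * c - (G / LENGTH) * s
    phi_ddot = -2.0 * theta_dot * phi_dot * c / s  # needs sin(theta) != 0
    return (theta_dot, theta_ddot, phi_dot, phi_ddot)

def rk4_step(state, dt):
    """Advance the state by one fourth-order Runge-Kutta step."""
    def nudge(k, f):
        return tuple(x + f * kx for x, kx in zip(state, k))
    k1 = derivs(state)
    k2 = derivs(nudge(k1, dt / 2))
    k3 = derivs(nudge(k2, dt / 2))
    k4 = derivs(nudge(k3, dt))
    return tuple(x + dt / 6 * (a + 2 * b + 2 * c + d)
                 for x, a, b, c, d in zip(state, k1, k2, k3, k4))

def angular_momentum(state):
    """L_phi = m l^2 sin^2(theta) phi_dot, the quantity conjugate to phi."""
    theta, _, _, phi_dot = state
    return MASS * LENGTH**2 * math.sin(theta)**2 * phi_dot

def evolve(state, dt=1e-3, steps=5000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state

start = (0.5, 0.0, 0.0, 2.0)  # theta, theta_dot, phi, phi_dot (arbitrary)
end = evolve(start)
print("L_phi at start:", angular_momentum(start))
print("L_phi at end  :", angular_momentum(end))
```

The polar angle θ and the rate φ̇ both vary during the motion, but their combination L_φ should stay constant to within the integration error, as expected for a cyclic coordinate.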

Another general feature is the ability to move terms between being 
considered as kinetic or potential energy, giving what is referred to as an 
effective potential V_eff. In the example, we can change the Lagrangian
by inserting L_φ, which is constant:


$$\mathcal{L} = \tfrac{1}{2}m(l\dot\theta)^2 + \frac{L_\phi^2}{2ml^2\sin^2\theta} - (1 - \cos\theta)mgl \qquad (12.10)$$

We can now pretend that the second term is part of the potential en- 
ergy (even though it started as part of the kinetic energy) and we have 
an equation involving only one degree of freedom moving in a ficti- 
tious potential describing our two-dimensional pendulum. This feature 
of swapping terms around within the Lagrangian carries over to the use 
of Lagrangians in RQF and is very helpful for understanding the Higgs 
mechanism.


12.4.2 Lagrangians in quantum mechanics


We shall not dwell too much on Lagrangians in quantum mechanics. The
terms in the Lagrangian still correspond to what we loosely call kinetic
and potential energy. Just as it is possible to have several objects in
one classical Lagrangian, it is possible to express a system with many
fields as a single Lagrangian. For each field in the Lagrangian, there is an
analogue to the classical Euler-Lagrange equations,⁸ and different terms
in the Lagrangian produce recognizable expressions on application of the
Euler-Lagrange equations. For example, terms in the Lagrangian from a
spin-0 field give the Klein-Gordon equation, while terms from a spin-½
field give the Dirac equation. This also applies to spin-1 fields, and in
the case of the photon (a spin-1 field with zero mass), one can use the
Lagrangian approach to generate Maxwell’s equations.

8 These have more parts to them
than the classical ones, however, because
of the need to maintain Lorentz
invariance.


12.5 Higgs mechanism—more 
mathematical 


We are going to repeat the discussion above, but now following the terms 
that appear in the equations more closely. We will use Lagrangians since 
the formalism is easier to follow and this will provide an introduction 
to the detailed explanations in more advanced textbooks. We will build 
up the concepts that go into the Higgs mechanism with four examples. 
The particular potentials used here are examples only, and there are a 
number of different fields that can be made to work theoretically, but 
using these examples gives a good insight into how the mechanism works 
in principle. 


Example 1 


We start with the Lagrangian for a spin-0 particle with no potential 
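energy term:


$$\mathcal{L} = \underbrace{\tfrac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi)}_{\text{K.E. term}} - \underbrace{\tfrac{1}{2}m^2\phi^2}_{\text{mass term}} \qquad (12.11)$$


Applying the Euler-Lagrange equation to this (see Exercise 12.4) gives
the Klein-Gordon equation for a spin-0 particle, ∂_μ∂^μφ + m²φ = 0. The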
second term in the Lagrangian gives rise to the mass term. For it to be 
a mass term, it must be proportional to the square of the field and it 
must be negative. 
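
The step from the Lagrangian to the Klein-Gordon equation can be sketched as follows. The block assumes the standard field-theory form of the Euler-Lagrange equation, which the text does not write out explicitly, applied to a Lagrangian of the form ½(∂_μφ)(∂^μφ) − ½m²φ².

```latex
% Euler-Lagrange equation for a field \phi(x) (standard form, assumed here)
\begin{align}
\partial_\mu\!\left(\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)}\right)
  - \frac{\partial\mathcal{L}}{\partial\phi} &= 0, \\
\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi)} = \partial^\mu\phi,
  \qquad \frac{\partial\mathcal{L}}{\partial\phi} &= -m^2\phi, \\
\Rightarrow\quad \partial_\mu\partial^\mu\phi + m^2\phi &= 0 .
\end{align}
```

A quadratic term with the opposite sign, +½μ²φ², would instead give ∂_μ∂^μφ − μ²φ = 0, which corresponds to a negative mass squared; this is why the sign of the quadratic term decides whether it can be read as a mass term.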


Example 2 
We now modify this by setting m = 0 (so the mass term disappears) but 
we add a potential V = ½μ²φ² (remember that ℒ = T − V):


$$\mathcal{L} = \underbrace{\tfrac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi)}_{\text{K.E. term}} - \underbrace{\tfrac{1}{2}\mu^2\phi^2}_{\text{P.E.}} \qquad (12.12)$$

9 We have dropped a constant term
because it disappears when the Euler-
Lagrange equations are applied, so we
can ignore it.


Notice how the second term looks just like a mass term. If the particle 
is in the field that has produced this type of potential, it will behave as 
if it has a mass. Now we change to a more sophisticated potential of the 
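type discussed earlier, V = ¼λ²φ⁴ − ½μ²φ²:


$$\mathcal{L} = \underbrace{\tfrac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi)}_{\text{K.E. term}} \underbrace{- \tfrac{1}{4}\lambda^2\phi^4}_{\text{interaction}} \underbrace{+ \tfrac{1}{2}\mu^2\phi^2}_{\text{not a mass term}} \qquad (12.13)$$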


Terms that are third or fourth powers of a field (or combination of fields) 
represent interactions in RQF. So the φ⁴ term above represents an interaction.
The final term in this Lagrangian is a problem: we do not know
how to interpret it. It is not a mass term, because it has the wrong sign. 

However, if we change our zero point in the field, what happens? To 
do this, we need to re-express the field φ in terms of a new field ρ that
is zero at the bottom of one of the two minima. Let φ = ρ ± μ/λ, where
the ± reflects the fact that there are two minima that can be used. We
get (see Exercise 12.6):


$$\mathcal{L} = \underbrace{\tfrac{1}{2}(\partial_\mu\rho)(\partial^\mu\rho)}_{\text{K.E.}} \underbrace{- \mu^2\rho^2}_{\text{mass}} \underbrace{\mp \lambda\mu\rho^3 - \tfrac{1}{4}\lambda^2\rho^4}_{\text{interactions}} \qquad (12.14)$$


Now we have a mass term! 
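
The change of variable can be checked numerically. The sketch below assumes the potential V = ¼λ²φ⁴ − ½μ²φ² and compares it, point by point, with the polynomial in ρ obtained from φ = ρ ± μ/λ, namely a constant −μ⁴/(4λ²) plus μ²ρ² ± λμρ³ + ¼λ²ρ⁴ (so that these terms enter ℒ = T − V with the opposite signs). The values of λ and μ and the function names are arbitrary choices for illustration.

```python
def V(phi, lam, mu):
    """Original potential, written in terms of the field phi."""
    return 0.25 * lam**2 * phi**4 - 0.5 * mu**2 * phi**2

def V_expanded(rho, lam, mu, sign=+1):
    """Same potential rewritten with phi = rho + sign * mu / lam."""
    constant = -mu**4 / (4.0 * lam**2)       # dropped in the Lagrangian
    mass_term = mu**2 * rho**2               # (1/2) m^2 rho^2 with m^2 = 2 mu^2
    cubic = sign * lam * mu * rho**3         # interaction
    quartic = 0.25 * lam**2 * rho**4         # interaction
    return constant + mass_term + cubic + quartic

lam, mu = 0.7, 1.3  # arbitrary illustrative values
for sign in (+1, -1):
    for rho in (-1.5, -0.3, 0.0, 0.4, 2.0):
        phi = rho + sign * mu / lam
        diff = V(phi, lam, mu) - V_expanded(rho, lam, mu, sign)
        print(f"sign={sign:+d} rho={rho:+.1f} difference={diff:.2e}")
```

The differences should vanish up to rounding, confirming that the quadratic term in ρ has the sign of a genuine mass term with m² = 2μ², whichever of the two minima is chosen.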


Example 3 


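We need to make φ complex to proceed with local gauge invariance. Let
φ = φ₁ + iφ₂, where φ₁ and φ₂ are real, and consider the potential V =
¼λ²(φ*φ)² − ½μ²φ*φ = ¼λ²(φ₁² + φ₂²)² − ½μ²(φ₁² + φ₂²). The Lagrangian
becomes


$$\mathcal{L} = \tfrac{1}{2}(\partial_\mu\phi_1)(\partial^\mu\phi_1) + \tfrac{1}{2}(\partial_\mu\phi_2)(\partial^\mu\phi_2) - \tfrac{1}{4}\lambda^2(\phi_1^2 + \phi_2^2)^2 + \tfrac{1}{2}\mu^2(\phi_1^2 + \phi_2^2) \qquad (12.15)$$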


The first two terms are kinetic energy terms. Next, as in Example 2, 
we expand around a minimum. In this case, we choose φ₁ = ρ + μ/λ
and φ₂ = ρ′. This is where the minimum is on the positive real axis.
We could do it about any point, and would get the same result (but
perhaps after quite a lot of algebra and some redefinition of the fields).
The Lagrangian is now


$$\begin{aligned}
\mathcal{L} = {} & \underbrace{\tfrac{1}{2}(\partial_\mu\rho)(\partial^\mu\rho) - \mu^2\rho^2}_{\rho\ \text{scalar},\ m = \sqrt{2}\,\mu} + \underbrace{\tfrac{1}{2}(\partial_\mu\rho')(\partial^\mu\rho')}_{\rho'\ \text{scalar, no mass}} \\
& \underbrace{- \lambda\mu(\rho^3 + \rho\rho'^2) - \tfrac{1}{4}\lambda^2(\rho^4 + \rho'^4 + 2\rho^2\rho'^2)}_{\text{interactions}}
\end{aligned} \qquad (12.16)$$


The ρ field becomes a massive field with mass √2 μ and the ρ′ is a massless
field (the Goldstone boson we discussed earlier). There are several
interaction terms and we have omitted a constant term. 


Example 4 


Next, we apply local gauge invariance, by adding a massless vector field, 
i.e. a field with spin 1 like a photon field. There is a prescription— 
it makes the terms we are familiar with in the local gauge invariance 
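discussion cancel properly. We start from Example 3:¹⁰


$$\begin{aligned}
\mathcal{L} = {} & \underbrace{\tfrac{1}{2}\left[(\partial_\mu - iqA_\mu)(\phi_1 - i\phi_2)\right]\left[(\partial^\mu + iqA^\mu)(\phi_1 + i\phi_2)\right]}_{\text{K.E. term modified for local gauge invariance}} \\
& \underbrace{- \tfrac{1}{4}\lambda^2(\phi_1^2 + \phi_2^2)^2 + \tfrac{1}{2}\mu^2(\phi_1^2 + \phi_2^2)}_{-V} \underbrace{- \tfrac{1}{4}F^{\mu\nu}F_{\mu\nu}}_{\text{K.E. for vector field}}
\end{aligned} \qquad (12.17)$$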


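where F^μν = ∂^μ A^ν − ∂^ν A^μ (see Exercise 12.5). We now expand, as is
becoming familiar, about a minimum in the potential: φ₁ = ρ + μ/λ and
φ₂ = ρ′. The result is


$$\begin{aligned}
\mathcal{L} = {} & \underbrace{\tfrac{1}{2}(\partial_\mu\rho)(\partial^\mu\rho) - \mu^2\rho^2}_{\rho\ \text{scalar},\ m = \sqrt{2}\,\mu} + \underbrace{\tfrac{1}{2}(\partial_\mu\rho')(\partial^\mu\rho')}_{\rho'\ \text{scalar, no mass}} \\
& \underbrace{- \tfrac{1}{4}F^{\mu\nu}F_{\mu\nu}}_{\text{vector K.E.}} + \underbrace{\tfrac{1}{2}q^2\left(\frac{\mu}{\lambda}\right)^2 A_\mu A^\mu}_{\text{vector mass}} + \underbrace{q\,\frac{\mu}{\lambda}\,A_\mu\partial^\mu\rho'}_{\text{problem}} \\
& + \text{interaction terms}
\end{aligned} \qquad (12.18)$$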

This is now locally gauge-invariant. The spin-1 field has acquired a mass 
(the second term on the second line). However, there remains a problem, 
namely the third term on the second line, which looks like an interaction 
that allows the A^μ field to spontaneously change into the ρ′ field.

There is still another trick up our sleeves. We can now pick a particular 
gauge. Although the form of the Lagrangian will change when we do 
this, its physical meaning will stay the same (that is what we mean by 
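gauge-invariant). We change φ with a phase as follows: φ → e^(iθ)φ, where
tan θ = −φ₂/φ₁. This particular choice makes the ρ′ field disappear, but
we are constrained by local gauge invariance and we are modifying the
A^μ fields to compensate for this. What we get is


$$\mathcal{L} = \underbrace{\tfrac{1}{2}(\partial_\mu\rho)(\partial^\mu\rho) - \mu^2\rho^2}_{\text{massive scalar}} \underbrace{- \tfrac{1}{4}F^{\mu\nu}F_{\mu\nu} + \tfrac{1}{2}q^2\left(\frac{\mu}{\lambda}\right)^2 A_\mu A^\mu}_{\text{massive vector}} + \text{interaction terms} \qquad (12.19)$$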


10 This equation is eqn 13.108 from 
Burcham and Jobes and eqn 10.129 
from Griffiths—see Further Reading. 
11 It hasn't quite disappeared: when the vector field became massive, it
changed from having two polarization states to having three.


12 For the latest results, see the ATLAS 
and CMS links on this book’s website. 


Fig. 12.6 Theoretical cross section for different Higgs production
mechanisms [114] as a function of m_H for pp interactions at √s = 7 TeV.


We have created a mass for the spin-1 (vector) field and the Goldstone
boson has disappeared.¹¹


This is the Higgs mechanism. 
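
Two quick checks on this result may be useful. Assuming the usual normalization ½ m_A² A_μ A^μ for the mass term of a spin-1 field, the vector mass can be read off from the term labelled 'vector mass', and the number of degrees of freedom is the same before and after the gauge choice. The symbol m_A is introduced here only for illustration.

```latex
\begin{align*}
\tfrac{1}{2}m_A^2 A_\mu A^\mu = \tfrac{1}{2}\frac{q^2\mu^2}{\lambda^2}A_\mu A^\mu
 &\;\Rightarrow\; m_A = \frac{q\mu}{\lambda}, \\
\text{before: } 2\ (\rho,\ \rho') + 2\ (\text{massless } A^\mu) &= 4, \\
\text{after: } 1\ (\rho) + 3\ (\text{massive } A^\mu) &= 4 .
\end{align*}
```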


The real Higgs theory is an extension of the above outline. A somewhat
more complicated ‘doublet’ of Higgs scalars is used to begin with. This
allows local gauge invariance to be achieved in the manner shown above
with massless versions of all four fields of interest (W±, W⁰, and B⁰).
We choose a place in the Mexican hat potential to expand around as
before, and choose the gauge to make the problematic ‘A_μ ∂^μ ρ′’ terms
go away.

When we do this, everything works out to agree with experiment,
as described in the simplified description in Section 12.3: the W± and
Z⁰ each acquire a mass and the γ remains massless. Also, m_W/m_Z =
cos θ_W and the scalar Higgs remains massive. However, the theory does
not predict the mass of the Higgs boson.
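
As a quick numerical illustration of the relation m_W/m_Z = cos θ_W, the sketch below evaluates the weak mixing angle from rounded boson masses. The input values (80.4 GeV and 91.2 GeV) are assumptions brought in for illustration and are not taken from this chapter, and the relation is used at lowest order, ignoring radiative corrections.

```python
import math

# Rounded boson masses in GeV (assumed inputs, not from the text).
M_W = 80.4
M_Z = 91.2


def weak_mixing(m_w: float, m_z: float) -> tuple[float, float]:
    """Return (cos(theta_W), sin^2(theta_W)) using m_W / m_Z = cos(theta_W)."""
    cos_tw = m_w / m_z
    sin2_tw = 1.0 - cos_tw ** 2
    return cos_tw, sin2_tw


cos_tw, sin2_tw = weak_mixing(M_W, M_Z)
theta_w_deg = math.degrees(math.acos(cos_tw))
print(f"cos(theta_W) = {cos_tw:.3f}, sin^2(theta_W) = {sin2_tw:.3f}, "
      f"theta_W = {theta_w_deg:.1f} degrees")
```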


12.6 Higgs discovery 


In this section, we review the evidence for the discovery of the Higgs
boson. The experimental results shown come from the ‘discovery’
papers.¹² As mentioned above, the SM does not predict the mass of the
Higgs boson. However, once the mass is known, all the properties are
fully specified. The predicted cross section as a function of Higgs mass
m_H is shown in Fig. 12.6 [114]. From the direct LEP Higgs search and the
indirect constraints from the precision electroweak data, the 95%
confidence level range for the expected mass is 114 GeV < m_H < 149 GeV [115].
In this mass range, the dominant production mechanism is via a top-quark
loop, because of the very large top-quark mass (see Fig. 12.7) and
the large gluon parton distribution function in this x range.
The branching ratios of the Higgs boson as a function of m_H are shown
in Fig. 12.8 [114]. We can see that the Higgs boson tends to decay to
the heaviest particles that are kinematically allowed, since the mass of
a particle depends on its coupling to the Higgs boson. In the expected
range for m_H, the largest branching ratio is for decays into bb̄ quarks;
however, this channel is almost impossible to study in pp interactions
in this production mode¹³ because of the large irreducible background
from QCD production of bb̄ quarks.
At lowest order in perturbation theory, the decay H → γγ would not
occur, since the photon mass m_γ = 0. However, the decay can occur
through virtual t and W loops as shown in Fig. 12.9. There is significant
negative interference between the two diagrams.


Fig. 12.7 Feynman diagram for Higgs production via a top-quark loop.


13 An alternative production mode is VH, where V is either a W± or a Z.
The cross sections are smaller but this mode has much smaller backgrounds
and is expected to be observed at the LHC.


The most important decay channels for the Higgs boson discovery are
as follows:


• H → γγ: At first sight, this might appear very surprising because
the branching ratio is only about 0.2%. However, the invariant mass of
the γγ system can be determined rather precisely, so the search in
this channel then involves looking for a small ‘bump’ on top of a
smoothly falling background. This has the additional benefit that
the background can be determined by an empirical fit to the γγ
invariant mass distribution.


• H → ZZ*: For the actual mass of the Higgs boson, this channel
is below threshold for production of two real Zs; therefore, one
of the Zs will be off mass shell (denoted Z*), which reduces the
branching ratio to very small values. In addition, only the decays
of the Zs into electrons and muons are used. The advantages of


Fig. 12.8 Higgs branching ratios [114] as functions of m_H. (Plot: Higgs
BR + total uncertainty versus m_H in GeV.)


Fig. 12.9 Feynman diagrams for the decay H → γγ via a top-quark loop
(a) and a W loop (b).


14 The ‘reducible’ background is one
that could be removed if the detector 
were perfect. The ‘irreducible’ back- 
ground has the same final state as the 
signal, and hence it is impossible to 
remove it on an event-by-event basis. 
However, in general, we can use some 
distributions (e.g. the invariant mass) 
to make a statistical separation be- 
tween signal and background. 


15 The azimuthal angle can be determined from a line joining the shower
centre to the centre of the detector. In the longitudinal direction (z) the
event vertex distribution has a significant spread in z. The longitudinal
angle can be determined from the shower centres in the different
longitudinal layers. Alternatively, an algorithm can be used to determine
which is the correct vertex (at typical LHC luminosities, there will be
about 20 vertices) and the γ longitudinal angle can then be determined
from a line joining the event vertex to the shower centre.


Fig. 12.10 Response of the ATLAS EM calorimeter [26] for example events:
(a) π⁰ → γγ; (b) prompt γ. The prompt γ makes a single shower,
whereas the two γs from the π⁰ decay show evidence for two distinct
clusters.


this channel are that it is very clean and also allows for a precise 
reconstruction of the invariant mass of the ZZ*. 


• H → WW*: This channel is also suppressed for the actual mass
of the Higgs boson because one of the Ws is off mass shell. The
decay modes of the W that are used are W → eν_e and W → μν_μ.
The signal is larger than for ZZ*, but as there are two neutrinos in
the final state it is impossible to reconstruct an invariant mass. The
transverse mass (see Section 12.6.3) is used instead, because its
spectrum has an endpoint at the value of the invariant mass. This means
that the signal peak is very broad and a very careful prediction of
all the SM backgrounds is required.


12.6.1 γγ channel


We can have γs produced in the primary interaction (called ‘prompt’)
or as a result of the decays of mesons (mainly π⁰). There is a potentially
large reducible background from events with one prompt γ and a jet
faking a second prompt γ, or from two-jet events with both jets faking
a γ. These reducible backgrounds can be suppressed with a very
high-granularity electromagnetic (EM) calorimeter. A π⁰ → γγ decay will
tend to produce two distinct showers in the EM calorimeter as opposed to
the single shower from a genuine prompt γ. In addition, the prompt γ will
tend to be isolated whereas a γ from a π⁰ decay will in general be part of
a hadronic jet and will therefore not be isolated. The different responses
in the ATLAS calorimeter for photons from a π⁰ and a prompt γ are
illustrated in Fig. 12.10. This allows the reducible background to be
decreased to a level well below that of the irreducible background. The
irreducible background is from the QCD production of prompt γγ (see
Fig. 12.11).

The very good energy resolution of the EM calorimeter is critical for
a precise reconstruction of the γγ mass m_γγ. However, the γ directions
must also be well measured for a precise reconstruction of m_γγ, and this
requires good granularity in the EM calorimeter.¹⁵ The m_γγ spectrum
is fitted to a combination of an empirical background function and a
signal distribution. The signal shape is taken from the SM prediction after
simulating all detector effects, for given assumed values of m_H. Since the
expected statistical significance of various classes of events (e.g. whether
the γ converted into an e⁺e⁻ pair in the tracking detector) is different, it
is advantageous to plot the m_γγ spectrum weighted by the expected ratio
of signal to background. An example of such a plot from the Compact
Muon Solenoid (CMS) experiment is shown in Fig. 12.12 [64]. The plot
appears to show a Higgs-like signal on top of a smooth background,
but the discussion of the statistical significance will be deferred until
Section 12.6.4.
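
To see why both the energy and the direction measurements matter, recall that for two massless photons the invariant mass depends only on the two energies and the opening angle: m_γγ² = 2E₁E₂(1 − cos α). The sketch below evaluates this relation and shows how a small shift in an energy or in the angle moves the reconstructed mass. The function name, the symbol α for the opening angle and the example numbers are illustrative assumptions, not values from the experiments.

```python
import math


def diphoton_mass(e1: float, e2: float, alpha: float) -> float:
    """Invariant mass (GeV) of two massless photons.

    e1, e2 : photon energies in GeV
    alpha  : opening angle between the photon directions in radians
    """
    m2 = 2.0 * e1 * e2 * (1.0 - math.cos(alpha))
    return math.sqrt(max(m2, 0.0))  # guard against tiny negative rounding


# Hypothetical event: two 100 GeV photons with an opening angle chosen
# so that the pair has an invariant mass of 125 GeV.
alpha_nominal = 2.0 * math.asin(0.625)
m_nominal = diphoton_mass(100.0, 100.0, alpha_nominal)

# Shift one energy by 1%, or the opening angle by 10 mrad, to see how
# each mismeasurement moves the reconstructed mass.
m_energy_shift = diphoton_mass(101.0, 100.0, alpha_nominal)
m_angle_shift = diphoton_mass(100.0, 100.0, alpha_nominal + 0.010)

print(f"nominal: {m_nominal:.2f} GeV")
print(f"energy shifted by 1%: {m_energy_shift:.2f} GeV")
print(f"angle shifted by 10 mrad: {m_angle_shift:.2f} GeV")
```

In this toy event the two shifts move the mass by a comparable amount, which is the sense in which angular granularity matters alongside energy resolution.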


12.6.2 ZZ* channel 


In order to have a clean signal and to be able to make a precise
reconstruction of the mass of the Higgs boson, we use this mode with
electron and muon final states: Z/Z* → e⁺e⁻ or Z/Z* → μ⁺μ⁻. In these decay
modes, we have the best signal-to-background (S/B) ratio, but the rate
is suppressed by the small branching ratios for the Z/Z* decay modes
used. The main reducible background for this channel comes from events
with two ‘prompt’ leptons and two leptons from the semileptonic decays
of b quarks (e.g. Zbb̄ and tt̄ → WbWb̄). The relatively long flight path of B
hadrons can be used to veto leptons from the decays of b quarks. The
irreducible background is from ZZ* production without a Higgs boson
as an intermediate state, which is indistinguishable from the signal,
apart from the fact that the mass spectrum of the four leptons (m_4ℓ)
for the signal will show a peak at m_H, whereas the background will
be a smooth distribution. Clearly, the excellent energy and momentum


Fig. 12.11 Lowest-order Feynman diagram for prompt γγ production.


Fig. 12.12 Distribution of m_γγ weighted by S/B, with S and B being the
expected signal and background, respectively, from the CMS experiment [64].
Fig. 12.13 Distribution of m_4ℓ from the CMS experiment [64]. The expected
background and signal for the case m_H = 125 GeV are also shown. The
insert shows the mass distribution for a subsample of the events that pass
a kinematic selection designed to optimize the ratio of signal to background.
resolutions for electrons and muons, respectively, help improve the S/B
ratio. The measured distribution of m_4ℓ in the CMS experiment [64] is
compared with the combination of the expected background and signal
for a SM Higgs (with m_H = 125 GeV) in Fig. 12.13.
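
The size of the branching-ratio suppression can be estimated with a line of arithmetic. The sketch below uses approximate branching ratios per lepton flavour for Z → ℓ⁺ℓ⁻ and W → ℓν (about 3.4% and 10.8%); these rounded inputs are assumptions brought in for illustration and do not come from this chapter. For comparison, it also evaluates the eνμν final state of a W pair.

```python
# Approximate branching ratios per lepton flavour (assumed, rounded inputs).
BR_Z_TO_LL = 0.034    # Z -> e+ e-  or  Z -> mu+ mu-
BR_W_TO_LNU = 0.108   # W -> e nu   or  W -> mu nu


def zz_to_four_leptons(br_ll: float) -> float:
    """Fraction of ZZ(*) pairs in which each Z decays to e+e- or mu+mu-."""
    per_z = 2.0 * br_ll        # either electrons or muons
    return per_z ** 2          # both Z bosons must decay this way


def ww_to_e_mu(br_lnu: float) -> float:
    """Fraction of WW(*) pairs giving one e nu and one mu nu."""
    return 2.0 * br_lnu ** 2   # (e, mu) or (mu, e)


frac_zz = zz_to_four_leptons(BR_Z_TO_LL)
frac_ww = ww_to_e_mu(BR_W_TO_LNU)
print(f"ZZ -> 4 leptons (e, mu):  {frac_zz:.4f}")
print(f"WW -> e nu mu nu:         {frac_ww:.4f}")
```

Under these assumptions, fewer than one ZZ* pair in two hundred ends up in the four-lepton final state, which is why this very clean channel has so few events.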


12.6.3 WW* channel 


This channel benefits from the larger Higgs branching ratio compared
with that for the ZZ* channel. The channel that is least contaminated
by background and therefore has the best significance is the one with
electrons and muons: W/W* → eν_e and W/W* → μν_μ. This decay mode
has larger branching ratios than the decay modes we used for ZZ* (see
Section 12.6.2), so the total number of events expected is significantly
greater. The disadvantage of this channel is that there are two neutrinos
in the final state, so it is not possible to determine the invariant mass
(m_WW*). If there were only one neutrino, the transverse mass would
provide a sharp endpoint to the spectrum as in the case of single-W
production. It still turns out to be useful to define the transverse mass
in a similar way as for single-W production:


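\[
m_T^2 = \left(E_T^{\ell\ell} + p_T^{\text{miss}}\right)^2
 - \left|\vec{p}_T^{\,\ell\ell} + \vec{p}_T^{\,\text{miss}}\right|^2
\tag{12.20}
\]


where $\vec{p}_T^{\,\ell\ell}$ is the total momentum in the transverse plane of the two
charged leptons, $p_T^{\text{miss}}$ and $\vec{p}_T^{\,\text{miss}}$ are respectively the magnitude of the
missing transverse momentum and the missing transverse momentum
vector, and


\[
\left(E_T^{\ell\ell}\right)^2 = \left(p_T^{\ell\ell}\right)^2 + m_{\ell\ell}^2
\tag{12.21}
\]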


where m_ℓℓ is the invariant mass of the two charged leptons. It turns
out that the distribution of m_T has an endpoint at the value of the
invariant mass of the Higgs boson. However, the effective mass
resolution is only ~20%. There are significant reducible backgrounds from
W + jet events with the jet faking a lepton and from Drell-Yan
processes (see Chapter 9) with additional jets (e.g. e⁺e⁻ jet jet or
μ⁺μ⁻ jet jet) with fake missing transverse energy from mismeasurements.
Therefore, in ATLAS, only events with one W decaying to eν_e and one
decaying to μν_μ were considered. There remains an irreducible
background from WW production. In order to reduce this background on
a statistical basis, various cuts are used. For example, for WW arising
from the decay of a spin-0 Higgs, the spins of the two Ws must
point in opposite directions and the two charged leptons will tend to
have momenta in similar directions (see Exercise 12.11). The irreducible
background from WW production (see Fig. 12.14 for an example Feynman
diagram for WW production) must be determined very precisely
in order to be able to detect a significant excess from Higgs events.
This is done by selecting events with different kinematic cuts to obtain
background-dominated samples, which are used to fix the normalization
of the background. The ratio of the number of expected background
events in the signal region to the number of events in the background
(‘control’) region is taken from Monte Carlo simulation. The transverse
mass distribution measured by ATLAS is shown in Fig. 12.15 [26, 30] and
compared with the expectations from background and a SM Higgs signal
for m_H = 126 GeV.
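
Equations 12.20 and 12.21 translate directly into a few lines of code. The following is a minimal sketch, assuming the transverse momenta are supplied as (p_x, p_y) pairs in GeV; the function name and the example numbers are hypothetical.

```python
import math


def transverse_mass(pt_ll, m_ll, pt_miss):
    """Transverse mass m_T (GeV) following eqns 12.20 and 12.21.

    pt_ll   : (px, py) of the dilepton system in the transverse plane
    m_ll    : invariant mass of the two charged leptons
    pt_miss : (px, py) of the missing transverse momentum
    """
    pt_ll_mag = math.hypot(pt_ll[0], pt_ll[1])
    pt_miss_mag = math.hypot(pt_miss[0], pt_miss[1])
    et_ll = math.sqrt(pt_ll_mag ** 2 + m_ll ** 2)                  # eqn 12.21
    sum_x = pt_ll[0] + pt_miss[0]
    sum_y = pt_ll[1] + pt_miss[1]
    mt2 = (et_ll + pt_miss_mag) ** 2 - (sum_x ** 2 + sum_y ** 2)   # eqn 12.20
    return math.sqrt(max(mt2, 0.0))  # guard against rounding below zero


# Hypothetical event: the dilepton system recoils against the neutrinos.
m_t = transverse_mass(pt_ll=(30.0, 0.0), m_ll=40.0, pt_miss=(-30.0, 0.0))
print(f"m_T = {m_t:.1f} GeV")
```

With no missing transverse momentum the expression reduces to m_ℓℓ, which is a convenient sanity check on any implementation.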


12.6.4 Statistical significance 


The relatively small signals in all channels and the non-negligible back- 
grounds mean that a careful assessment of the statistical significance is 
required before any claims about a discovery can be made. The technique 


Fig. 12.14 An example Feynman 
diagram at the quark level for WW 
scattering. 


Fig. 12.15 Distribution of the transverse mass m_T from the ATLAS
experiment [26] in a sample of candidate WW events. The expected
background and signal for the case m_H = 126 GeV are also shown.
16 By ‘shape’ we mean the shape of the
distributions for the relevant variables.


17 In practice, we use a ‘binned’ likelihood, in which the events are
binned in the relevant variable(s) and the product is over the bins. In
addition, for ease of computation, we work with a ‘log-likelihood’
ln L(μ) = Σ_i ln p(i | μ, θ).


18 In order to simplify the explanation,
we now ignore the nuisance parameters.


19 In the frequentist interpretation of statistics, the statement that
the SM Higgs is excluded at a given value of m_H at 95% confidence level
means that if there were a SM Higgs at this mass and the experiment were
repeated many times, then in 95% of the cases a larger value of the test
statistic would be obtained.


used is based on maximum-likelihood fits. Probability density functions
(PDFs) are computed assuming there is a signal with the same shape¹⁶
as the SM Higgs but scaled by a normalization constant μ (i.e. μ = 1
corresponds to the SM Higgs expectation and μ = 0 corresponds to a pure
background distribution). The PDF also depends on many ‘nuisance’
parameters (such as uncertainties in the detector response to different
particles). The likelihood of a sample¹⁷ is defined as


\[
L(\mu) = \prod_i p(i \mid \mu, \theta)
\tag{12.22}
\]


where the product runs over all events in the sample and p(i | μ, θ) is
the probability of observing event i, assuming a particular value of the
parameter μ and for some set of nuisance parameters θ. The test statistic
is defined as¹⁸


\[
q_\mu = -2\ln\frac{L(\text{data} \mid \mu)}{L(\text{data} \mid \hat{\mu})}
\tag{12.23}
\]


where μ̂ is the value obtained by maximizing the likelihood by varying
the value of μ in the denominator in eqn 12.23 and the value of μ in the
numerator is fixed to a particular value depending on what statistical
test is being performed. We can then define two probabilities:


• For signal plus background, p_μ = P(q_μ > q_μ^obs | μ).
• For the background only, 1 − p_b = P(q_μ > q_μ^obs | μ = 0).


Finally, we define the ratio


\[
\mathrm{CL}_s(\mu) = \frac{p_\mu}{1 - p_b}
\tag{12.24}
\]


The 95% confidence level (CL) limit on μ is found by adjusting μ until
CL_s(μ) = 0.05. This procedure is carried out for a range of values of m_H
and we can state that the SM Higgs boson is excluded at 95% confidence
level for a particular value of m_H if μ(95% CL) < 1.¹⁹ The resulting
exclusion plot from ATLAS is shown in Fig. 12.16(a) [26]. The SM Higgs is
excluded over the ranges 111-121 GeV and 131-559 GeV. The reason why
there is a gap in the exclusion range between 121 and 131 GeV is that
there is evidence for an excess over the SM background. The significance
of this excess can be quantified by the probability p_0 that a
background-only hypothesis could generate a larger value of the test
statistic q_0 for a particular value of m_H. The resulting p_0 versus m_H
plot for ATLAS is shown in Fig. 12.16(b). The minimum p_0 value
corresponds to a 6σ fluctuation for a Gaussian distribution. However, it
is important to allow for the ‘look-elsewhere effect’: the p_0 value
corresponds to the probability of observing a larger fluctuation at a
particular value of m_H, but we could have seen an excess over a range of
values for m_H. Therefore, the probability of observing an equal excess
anywhere in the full mass range is larger by a factor N ∼ Δm_H/σ(m_H),
where Δm_H is the range in m_H and σ(m_H) is the resolution in the
event-by-event measurement.
Allowing for this effect for a search over the allowed range from previous
experiments at LEP and the Tevatron (110-150 GeV), the statistical
significance is reduced to 5.3σ.²⁰ A very similar result was obtained from
the CMS experiment [64].²¹ Finally, one can ask if the observed excess is
consistent with the SM Higgs. This is addressed in Fig. 12.16(c), which
shows the 95% confidence level on the signal strength parameter μ as
a function of m_H. This shows that the result is consistent with the SM
Higgs expectations for a Higgs with m_H = 126 GeV.
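
The logic of the CL_s limit and of the look-elsewhere correction can be illustrated with a toy calculation. The sketch below is not the procedure used by the experiments: it replaces the likelihood-ratio test statistic by the event count in a single bin (fewer events counting as less signal-like), ignores nuisance parameters, and treats the trial factor N as a simple multiplier on a one-sided Gaussian tail probability. All function names and input numbers are hypothetical.

```python
import math
from statistics import NormalDist


def poisson_cdf(n: int, mean: float) -> float:
    """P(N <= n) for a Poisson-distributed count with the given mean."""
    term = math.exp(-mean)
    total = term
    for k in range(1, n + 1):
        term *= mean / k
        total += term
    return total


def cls(mu: float, n_obs: int, s: float, b: float) -> float:
    """CLs = p_mu / (1 - p_b) for a one-bin counting experiment."""
    p_mu = poisson_cdf(n_obs, mu * s + b)      # signal + background
    one_minus_p_b = poisson_cdf(n_obs, b)      # background only
    return p_mu / one_minus_p_b


def mu_upper_limit(n_obs: int, s: float, b: float, alpha: float = 0.05) -> float:
    """Adjust mu by bisection until CLs(mu) = alpha (requires s > 0)."""
    lo, hi = 0.0, 1.0
    while cls(hi, n_obs, s, b) > alpha:        # expand until the root is bracketed
        hi *= 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if cls(mid, n_obs, s, b) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def p_value_from_sigma(z: float) -> float:
    """One-sided Gaussian tail probability for a z-sigma fluctuation."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def sigma_from_p_value(p: float) -> float:
    """Inverse of p_value_from_sigma, valid for 0 < p < 1."""
    return -NormalDist().inv_cdf(p)


def global_sigma(local_sigma: float, n_trials: float) -> float:
    """Apply a trial factor N to a local significance (capped at p = 0.5)."""
    p_global = min(0.5, n_trials * p_value_from_sigma(local_sigma))
    return sigma_from_p_value(p_global)


# Hypothetical inputs: 10 expected signal and 20 expected background events
# with 22 observed; a 6 sigma local excess with an assumed trial factor of 20.
mu_95 = mu_upper_limit(n_obs=22, s=10.0, b=20.0)
print(f"95% CL upper limit on mu: {mu_95:.2f}")
print(f"excluded at 95% CL: {mu_95 < 1.0}")
print(f"global significance: {global_sigma(6.0, 20.0):.2f} sigma")
```

A useful check of the counting version is the case of zero observed events, for which CL_s = e^{−μs} whatever the background, so the limit is μs = −ln 0.05 ≈ 3 events.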


12.7 Coupling to fermions 


The SM Higgs mechanism generates masses for fermions as well as for 
bosons. Therefore, the Higgs boson should have decay modes to fermions 
Fig. 12.16 (a) 95% confidence limit on the SM Higgs. (b) Local p_0 values
as a function of m_H. (c) Variation of fitted signal strength μ with m_H
from the ATLAS experiment [26].


20 The look-elsewhere effect only accounts for the range of mass for this
one study. Since experiments make many independent measurements, we want
to minimize the probability of falsely claiming a discovery, so a high
threshold in significance is used before claiming a discovery.
Conventionally, this is taken to be 5σ.


21 Consistent results were obtained by the Tevatron experiments at the 3σ
level, mainly in the bb̄ channel.
22 This decay mode should be observable at the LHC in the VH production
mode, where V = W± or Z. However, as this production mode has a smaller
cross section, it will need larger data samples than are currently
available (as of 2014).


23 The backgrounds are so large that 
simply making a sequence of cuts to 
optimize signal and reduce background 
would not be sufficient to reveal a 
signal, and so a more sophisticated 
multivariate analysis is required (see 
Chapter 8). 


Fig. 12.17 Distribution of the best estimate for the Higgs boson mass for
data and SM backgrounds for candidate ττ events. All events have been
weighted by a factor ln(1 + S/B), where S/B is the expected
signal-to-background ratio [28].


that grow with the mass of the fermion. As the Higgs boson is too light
to decay into tt̄, the most massive fermion available for this process is
the b quark and hence there is a large branching ratio for H → bb̄.
However, this decay mode is extremely difficult to study because of
the very large QCD production of bb̄.²² The next heaviest fermion in
the SM is the τ lepton and there is a significant branching ratio for
H → ττ (see Fig. 12.8). The identification of τ leptons is much harder
than the identification of e or μ. There are many backgrounds to
consider. The H → ττ channel has an irreducible background from Z → ττ.
Separation between the signal and this background can be achieved by
reconstructing the mass of the Higgs boson, but the mass resolution
is severely degraded by the presence of multiple neutrinos in the final
state (see Section 12.6.3). The leptonic decay modes of the τ have
smaller branching ratios than the hadronic modes. However, the hadronic
decays of the τ have very large backgrounds from QCD jets that can
be misinterpreted as τs. Some separation between hadronically decaying
τs and QCD jets is obtained by counting the number of charged tracks
in a narrow cone around the τ candidate. The number of tracks from τ
decays is usually one or three, whereas QCD jets have a higher average
charge multiplicity. Additional separation between τ hadronic decays
and QCD jets is provided by measurements of the shower profile in the
calorimeter, since the τs will tend to produce relatively narrow jets.²³
Although the invariant mass cannot be reconstructed, a best estimate
of the mass can be made (see Exercise 12.10 for a simpler method to
estimate the Higgs mass in ττ decays). The mass distributions for the data
and all the SM backgrounds are shown in Fig. 12.17 and a peak above
the SM background can be seen. The peak is consistent with the SM
expectation and provides evidence at the 4.1σ level for this decay mode
of the Higgs boson [28]. The CMS experiment also found evidence for
this decay mode [66].


12.8 Determination of the spin and parity 
of the new boson 


One fundamental prediction of the SM is that the spin/parity of the
Higgs boson should be J^P = 0⁺. Since the new boson decays to two
bosons, it must have integer spin. The observation of the decay mode
H → γγ excludes the spin-1 hypothesis [95]. The SM predictions can be
compared with models in which the boson has alternative spin/parity
assignments. For the γγ decay mode, we can compare the SM with
predictions for graviton-inspired models²⁴ with J^P = 2⁺. We can define the
angle of the photons relative to the direction of the Higgs boson (θ*).
The distribution of cos θ* should be isotropic for the case of a spin-0
Higgs. However, in the case of a boson with J^P = 2⁺, the distribution
can be forward-peaked. The angular distribution of the observed events
is biased by the acceptance of the detector and the selection required to
isolate the signal above the background. However, there are still
significant differences between the predictions of the two models, and the
data can be used to discriminate between them. The distribution of
cos θ* measured by ATLAS [29] is compared with the two hypotheses in
Fig. 12.18. Information on the spin/parity can also be obtained from the
ZZ* decay mode, although the errors are still quite large because of the
limited number of events in the current data sample. The WW decay modes
for the case of a spin-0 Higgs should result in correlations between the
azimuthal angles of the charged leptons from the resulting decays of the
W bosons (see Exercise 12.11). Combining the analyses from three decay
modes, all the results are consistent with the SM quantum numbers of
J^P = 0⁺ and alternative models can be excluded at a range of confidence
levels from 97.7% to 99.9% [28]. Results of a similar analysis from the CMS
experiment [65] in the H → ZZ decay mode also favour the SM and
disfavour alternative models.
A clear summary of the Higgs boson coupling measurements from the
CMS experiment [67] is shown in Fig. 12.19. This shows the measured
strength of the Higgs boson coupling to fermions (λ) or bosons (g) as a
function of the mass of the particle. For the top quark, the decay H → tt̄
is kinematically forbidden because of the large mass of the top quark.
However, we can still determine the coupling of the top quark to the
Higgs boson, because the dominant production mechanism is via a
top-quark loop (see Fig. 12.7). In the SM, we expect the coupling strengths
to increase with the mass of the particle. The measured coupling strengths
show the increase with particle mass expected in the SM.
However, the errors in some of the measurements are large and more
data will make this test more powerful.²⁵


24 The graviton is a hypothetical particle that is believed to be the
quantum of the gravitational field. The graviton must have spin 2 to
correspond to the classical theory of general relativity, in which the
gravitational interaction arises from the stress-energy tensor, which is a
second-rank tensor.


25 Eventually, there should be sufficient data to observe the H → μμ
decay mode and thus provide an additional data point at much lower mass.
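
The expected trend of coupling with mass can be made concrete for the fermions. In the SM the fermion coupling is proportional to the fermion mass; in one common convention it is y_f = √2 m_f/v with v ≈ 246 GeV. The sketch below assumes that convention and rounded masses, neither of which is taken from this chapter, and published plots may use definitions that differ by constant factors.

```python
import math

V_EW = 246.0  # GeV, approximate Higgs vacuum expectation value (assumed input)

# Approximate fermion masses in GeV (assumed, rounded inputs).
MASSES = {"mu": 0.106, "tau": 1.78, "b": 4.2, "t": 173.0}


def yukawa(mass: float, v: float = V_EW) -> float:
    """SM Yukawa coupling y_f = sqrt(2) * m_f / v (one common convention)."""
    return math.sqrt(2.0) * mass / v


for name, mass in MASSES.items():
    print(f"{name:>3}: m = {mass:8.3f} GeV, y = {yukawa(mass):.4f}")
```

The top-quark coupling comes out close to 1 while the muon coupling is more than a thousand times smaller, which shows the span of masses over which the proportionality is being tested.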
Fig. 12.18 Angular distribution of cos θ* measured by ATLAS [29]
compared with two hypotheses for the spin/parity: (a) J^P = 0⁺;
(b) J^P = 2⁺.


Fig. 12.19 Measured values of the Higgs boson coupling to fermions (λ)
or bosons (g) as a function of particle mass from the CMS experiment [67].
12.9 Outlook


The results already obtained provide convincing evidence for the
existence of a new boson (it decays to γγ, so it must have even spin).²⁶ The
study of the spin/parity gives results consistent with the SM expectation
of 0⁺ and other possibilities are excluded. The properties appear to be
consistent with the SM Higgs, but this is just the beginning of a new 
chapter in physics, which will involve many more detailed measurements: 


(1) a precise measurement of the rate to check compatibility 
with the SM; 


(2) measurements of ratios of branching ratios for as many decay 
modes as possible; 


(3) measurements of the ratio of production cross sections for gg
processes and vector-boson fusion (e.g. qq → qqWW → qqH);

(4) measurements of high-energy longitudinally polarized WW scat- 
tering; this process is sensitive to additional scalars in the theory 
or to the possible composite nature of the Higgs boson; 


(5) Higgs self-interactions, i.e. production of two Higgs bosons in one 
event. 


The first three items will be studied with new data in the next few years 
at the LHC. The last two items are more challenging and will require 
a further upgrade to the LHC luminosity.²⁷ In the SM, there is only
expected to be one Higgs boson, but in many Beyond the SM (BSM) 
theories, there can be more. For example, in the minimal supersymmetric 
extension of the SM (MSSM), there should be five Higgs bosons (see 
Chapter 13). Therefore, searching for additional Higgs bosons will clearly 
be a vital part of the future LHC physics programme. 


Chapter summary 


• The Higgs mechanism allows for the generation of mass by spontaneous
symmetry breaking, without violating the SU(2) × U(1) gauge symmetry.

• There is compelling evidence for the existence of a new boson at a mass
of 125 GeV, with properties consistent with those expected from the SM.

• The measurements of the spin/parity favour the 0⁺ assignment expected
in the SM.

• Many more measurements will be required to confirm that the new boson
is indeed the SM Higgs boson. Searches will be made for additional Higgs
bosons as expected in theories like supersymmetry.
26 The γγ decay mode excludes spin 1.


27 Such an upgrade to the LHC is
planned for 2024. This will also require 
extensive upgrades to the ATLAS and 
CMS detectors to cope with the higher 
luminosity. 
Further reading


• Griffiths, D. (2008). Introduction to Elementary Particles (2nd revised
edn). Wiley-VCH. This gives a more advanced treatment of the Higgs
mechanism.

• Burcham, W. E. and Jobes, M. (1994). Nuclear and Particle Physics.
Pearson. This gives a fuller but accessible account of the Higgs
mechanism in the SM.

• Aitchison, I. J. R. and Hey, A. J. G. (2013). Gauge Theories in Particle
Physics, Volume 2 (4th edn). CRC Press. This is a thorough and more
advanced graduate-level treatment of the Higgs mechanism.


Exercises 


(12.1) Apply the gauge transformations for the A, φ, and ψ fields in
eqn 12.3 and hence verify that the Dirac equation is gauge-invariant.


(12.2) Evaluate the potential and kinetic energy terms for a spherical
pendulum and hence verify the form of the Lagrangian in eqn 12.7. Using
the Euler-Lagrange equations and this Lagrangian, determine the equations
of motion and check that they agree with eqn 12.8.


(12.3) A Taylor series. Show that the following three
expressions are identical:


The second and third expressions are in the form
of Taylor series of the first about the points
x = 1 and x = −1, respectively, which are the
two minima of the function. The series terminate
(why is this?) and so we end up with terms only
up to the fourth order.


(12.4) Verify that applying the Euler-Lagrange equation to the Lagrangian
for a spin-0 particle (eqn 12.11) gives the Klein–Gordon equation.
Note: If your answer has an unwanted factor 2, check by first expanding
the (∂_μ φ)(∂^μ φ) term into the four components.


(12.5) Consider the Lagrangian given by

\[
\mathcal{L} = -\tfrac{1}{4}\,(\partial^\mu A^\nu - \partial^\nu A^\mu)(\partial_\mu A_\nu - \partial_\nu A_\mu)
\]

Apply the Euler-Lagrange equations to show that the equations of motion
are ∂_μ F^μν = 0, where F^μν = ∂^μ A^ν − ∂^ν A^μ. Comment on the relation
of this result to classical electromagnetism.


(12.6) Consider the potential energy function given by eqn 12.4. Show that
φ = ±μ/λ correspond to the minima of the potential and φ = 0 is the
maximum. Let ρ = φ ± μ/λ to express the potential as a function of ρ.
Show that this function now has a positive mass term and determine the
value of the mass. We now have a ρ³ term. Draw the corresponding Feynman
diagram. We obtain a similar term in the full Higgs theory; explain the
significance in terms of di-Higgs production.


(12.7) Figure 12.6 shows an enhancement of the Higgs boson production
cross section as a function of mass at a mass around 350 GeV. Give a
simple explanation for this.


(12.8) Figure 12.8 shows the branching ratio as a function of Higgs mass
for Higgs boson decays to ZZ. Give a simple explanation for the behaviour
as a function of Higgs mass.


(12.9) The expected background in the four-lepton invariant mass (see
Fig. 12.13) shows peaks around 90 and 200 GeV. Give a qualitative
explanation for these peaks. Using the data in Fig. 12.13 and the
theoretical value for the branching ratios (see Fig. 12.8), make an
order-of-magnitude estimate of the Higgs boson production cross section.
Explain whether your estimate would be an underestimate or an overestimate.


(12.10) This question looks at the reconstruction of the Higgs boson mass
in H → ττ decays in the collinear approximation. The momenta of the
neutrino(s) from each τ decay are assumed to be parallel to each τ. Show
that if the angles in the transverse plane for each τ are measured, and
the transverse momentum of the recoiling hadronic system (R) is measured,
we can determine the transverse momenta of the two τs. What other
parameters must be measured in order to determine the invariant mass of
the Higgs boson? Why does this method break down if the Higgs boson is
produced with no transverse momentum?

Hint: It is sufficient to consider the special case in which the two τs
decay along the y axis.


(12.11) Consider the decay of a spin-0 Higgs to W⁺W⁻ with subsequent
leptonic decays of the Ws. Let Δφ be the azimuthal angle between the two
final-state leptons. Draw diagrams to illustrate the possible polarization
states of the Ws. What would be the most probable decay angle of the
leptons relative to the spin of the Ws? Hence explain why the distribution
of Δφ should peak at small values.

Why do the experiments use the relative angle between the leptons in the
transverse plane (Δφ) rather than the space angle?
1 Q² is the scale of the process; e.g. for W production, Q² = M_W².
LHC and BSM


Now that there is growing evidence for the discovery of the Higgs boson, 
there are no missing particles in the Standard Model (SM). There are 
still many details to be checked to be sure that the new boson discovered 
at the LHC (see Chapter 12) is compatible with the SM Higgs boson. 
Even if this turns out to be the case, however, serious problems remain 
with the SM. A brief review of these issues, particularly the hierarchy 
problem, will be given in this chapter. We will justify the claim that we 
should expect to see new Beyond the Standard Model (BSM) physics 
in the TeV energy range. This then provides one of the two principal 
motivations for the LHC (the first being understanding the origin of 
mass), and we will look at how LHC experiments are searching for new 
physics. The study of the Higgs boson could be extended at a future
linear e⁺e⁻ collider (a ‘Higgs factory’). Looking further into the future,
we will see how a high-energy linear e⁺e⁻ collider could probe deeper
into any new physics that might be discovered at LHC, if it were at an
accessible energy.


13.1 LHC and Standard Model physics 


The main motivation for the LHC is to understand the origin of mass and 
to search for new physics at the TeV scale. The origin of mass has been 
discussed in Chapter 12, so we will now consider BSM physics. However, 
any new physics channel will have some SM physics backgrounds, so 
it is essential to check that the SM works in the new region of phase 
space opened up by the LHC. This is non-trivial because, even with the 
LHC operating at 8 TeV centre-of-mass (CMS) energy (rather than the design
value of 14 TeV), high-Q² processes such as W/Z production¹ sample very much
lower values of the parton x distribution than previous experiments (see
Chapter 9). This is illustrated in Fig. 13.1, from which one can see the
large increase in the range of x that can be studied at high Q² at the
LHC, compared with previous accelerators [115].

The predicted cross sections (see Chapter 9 for an explanation of how 
these calculations are performed) are shown as a function of CMS energy 
in Fig. 13.2. There is an enormous spread in the magnitudes of the cross 
sections for SM processes and most of them are much larger than those 
expected for Higgs production or for new physics processes. 

This raises the critical question of how one can trigger and identify 
the interesting events in the presence of these enormous backgrounds. 


Particle Physics in the LHC Era, Giles Barr, Robin Devenish, Roman Walczak, 
& Tony Weidberg. © Giles Barr, Robin Devenish, Roman Walczak, 
& Tony Weidberg 2016. Published in 2016 by Oxford University Press. 
2In 2012 LHC operation, the luminosity was slightly lower than the
design value, but the b.c. spacing was 50 ns, so the average number
of interactions per bunch crossing was ~30.


13.2 LHC triggers 


The scale of the problem is set by the magnitude of the total cross
section. At the LHC design luminosity of 10³⁴ cm⁻² s⁻¹, the total
interaction rate is ~1 GHz. Each proton beam is composed of ~2000 bunches
of protons. Each bunch contains ~10¹¹ protons. During nominal LHC
operation, the bunches collide every 25 ns and there are an average of 24
interactions per bunch crossing (b.c.).² The general principles of
pipelined triggers were briefly reviewed in Chapter 4. Most of the triggers
are based on identification of objects like electrons, photons, muons, or
‘jets’ at high transverse momentum p_T. An example of a first-level (L1)
trigger in ATLAS is the e/γ trigger, which uses the fact that electrons
give very localized energy deposition in the electromagnetic (EM)
calorimeter, whereas the background processes arise from QCD jets, which
consist of mixtures of γs (mainly from π⁰ decays) and hadrons. The L1
e/γ trigger selects candidates by requiring localized energy in the EM
calorimeter and vetoing events with too much energy in the hadronic
calorimeter behind the candidate e/γ object or in the surrounding cells
of the EM calorimeter. The higher-level triggers can utilize data from
the tracking detector to provide further background rejection. A genu- 
ine electron will have a track with a transverse momentum and direction 
compatible with the energy deposition in the EM calorimeter, unlike the 
background from QCD jets. The fine granularity of the EM calorimeter 
is also used to provide more background rejection because EM showers 
are narrower than hadronic showers (see Chapter 4). 
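
The rates quoted at the start of this section can be checked with a few lines of arithmetic. The sketch below multiplies the design luminosity by a total pp cross section and divides by the bunch-crossing rate. The cross section of 100 mb and the function names are illustrative assumptions, not values taken from the text, and every 25 ns slot is assumed to contain colliding bunches.

```python
MB_TO_CM2 = 1e-27  # 1 millibarn expressed in cm^2


def interaction_rate(luminosity_cm2_s, sigma_mb):
    """Interactions per second: rate = luminosity * cross section."""
    return luminosity_cm2_s * sigma_mb * MB_TO_CM2


def mean_interactions_per_crossing(luminosity_cm2_s, sigma_mb, spacing_ns):
    """Average number of interactions in one bunch crossing,
    assuming every slot of length spacing_ns holds colliding bunches."""
    return interaction_rate(luminosity_cm2_s, sigma_mb) * spacing_ns * 1e-9


# Assumed total cross section of 100 mb (illustrative round number).
rate = interaction_rate(1e34, 100.0)
mu = mean_interactions_per_crossing(1e34, 100.0, 25.0)
print(f"rate ~ {rate:.1e} Hz, interactions per crossing ~ {mu:.0f}")
```

With these round inputs the rate is of order 1 GHz and the mean number of interactions per crossing is about 25, the same order as the figure quoted in the text.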

The very large trigger rejection must be achieved while still maintain- 
ing high efficiency for the interesting objects like electrons and muons. 
This raises the crucial question as to how one can determine if the trigger 
system is working efficiently or not. In general, this can be done using 
a ‘tag-and-probe’ analysis. For example, Z → e⁺e⁻ events can be
triggered with a single-electron trigger on one of the electrons in the
event. One of the electrons (we do not distinguish between electrons and
positrons for this analysis) is reconstructed, passes the electron
selection, and matches the detector location of the trigger object; this
electron serves as the ‘tag’ for the event. We can then look for a
second electron in the event that also passes the electron selection, and
this electron serves as the ‘probe’. We can examine the trigger data and
determine if this electron also passed the electron trigger. Let n_tag be
the number of tagged events and n_probe the number of events in which the
second object, the ‘probe’, also passes the trigger.
The trigger efficiency can be determined to be (see Exercise 13.1)

$$\epsilon = \frac{2\,n_{\rm probe}}{n_{\rm probe} + n_{\rm tag}} \qquad (13.1)$$

The beauty of the tag-and-probe method is that it provides a purely
data-driven efficiency determination and so does not rely on any
assumptions about the detector performance that are needed for a Monte
Carlo simulation. This method can be extended from a ‘global’ measurement
to a differential one in which the efficiency is measured as a function of
variables like the transverse momentum p_T.³
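
The tag-and-probe formula can be checked against expected counts. The sketch below assumes that each electron fires the trigger independently with the same probability, that n_tag counts events in which at least one electron fired, and that n_probe counts events in which both fired. These assumptions and the helper names are illustrative, not taken from the text.

```python
def trigger_efficiency(n_tag, n_probe):
    """Single-object trigger efficiency from tag-and-probe counts:
    eff = 2 * n_probe / (n_probe + n_tag)."""
    if n_tag + n_probe == 0:
        raise ValueError("no tagged events")
    return 2.0 * n_probe / (n_probe + n_tag)


def expected_counts(n_events, eff):
    """Expected (n_tag, n_probe) for n_events Z -> ee decays when each
    electron fires the trigger independently with probability eff."""
    n_probe = n_events * eff * eff               # both electrons fire
    n_tag = n_events * (1.0 - (1.0 - eff) ** 2)  # at least one fires
    return n_tag, n_probe


# Closure check: the formula should return the efficiency that was put in.
n_tag, n_probe = expected_counts(1000, 0.8)
print(trigger_efficiency(n_tag, n_probe))
```

Under these assumptions n_tag is proportional to ε(2 − ε) and n_probe to ε², so the ratio in the formula reduces to ε.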

We now consider the critical issue of how it is possible to search for 
very rare processes, despite the large backgrounds. The processes with 
the largest cross sections result in particles with limited transverse
momentum and can easily be rejected by any trigger requiring a high-p_T
object. For example, jet events can be studied by triggering on large
transverse energy in a localized region of the calorimeter. In order to 
trigger on rarer processes such as W and Z production, one can trig- 
ger on leptons. For the very small cross sections for processes like Higgs 
production, triggers on multiple objects can be used to further reduce 
the trigger rate. In addition to triggers on localized objects, there is also 
a very important trigger on ‘missing transverse momentum’ to detect 
weakly interacting particles like neutrinos. Let the energy measured in
a calorimeter cell (assuming massless energy deposits, so E = p) be
E_i and let the polar and azimuthal angles be θ_i and φ_i. The measured
momentum balance in the plane perpendicular⁴ to the beam axis (z) is
defined by

$$E_x^{\rm miss} = -\sum_i E_i \sin\theta_i \cos\phi_i, \qquad E_y^{\rm miss} = -\sum_i E_i \sin\theta_i \sin\phi_i \qquad (13.2)$$

and the magnitude of the missing transverse momentum is given by

$$E_T^{\rm miss} = \sqrt{(E_x^{\rm miss})^2 + (E_y^{\rm miss})^2} \qquad (13.3)$$

This E_T^miss trigger is useful for selecting events with neutrinos but is also
essential for searches for BSM physics such as supersymmetry that have
weakly interacting particles (see Section 13.4.1).
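
The missing-transverse-momentum sums translate directly into code. The sketch below assumes each calorimeter cell is given as a tuple (E, θ, φ) in GeV and radians; the two-cell event is hypothetical.

```python
import math


def missing_et(cells):
    """cells: list of (E, theta, phi) per calorimeter cell.
    Returns (Ex_miss, Ey_miss, ET_miss): minus the vector sum of the
    transverse energy components, and its magnitude."""
    ex = -sum(e * math.sin(th) * math.cos(ph) for e, th, ph in cells)
    ey = -sum(e * math.sin(th) * math.sin(ph) for e, th, ph in cells)
    return ex, ey, math.hypot(ex, ey)


# Hypothetical event: two central deposits that do not balance.
cells = [
    (40.0, math.pi / 2, 0.0),          # 40 GeV along +x
    (30.0, math.pi / 2, math.pi / 2),  # 30 GeV along +y
]
ex, ey, et = missing_et(cells)
print(ex, ey, et)
```

A perfectly balanced event gives a missing transverse momentum consistent with zero, up to the calorimeter resolution, which is why a large value signals an undetected particle.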


13.3 SM measurements at the LHC 


Many measurements of different SM physics processes have been per- 
formed at the LHC. This section will give a brief summary of a few of 
these studies. The largest cross sections for high-p_T processes are for
dijet production, since this is governed by the strong interaction. The
data for the distribution in jet transverse momentum are compared with
next-to-leading-order (NLO) QCD predictions [24] in Fig. 13.3.⁵ There
is very good agreement between data and the QCD prediction up to
p_T ~ 1 TeV. This very impressive agreement spans a dynamic range
of 10 orders of magnitude. As the jet cross section is a steeply falling
function of the jet transverse energy p_T, a relatively small error in the
p_T measurement will result in a large error in the cross section. It is
therefore essential to make an accurate calibration of the jet energies.
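
The sensitivity of a steeply falling spectrum to the energy scale can be illustrated with a toy model. The sketch below assumes a pure power law, dσ/dp_T ∝ p_T^(−n), with n = 6 chosen only for illustration (the real spectrum is not a single power law). If every jet p_T is mis-scaled by a factor (1 + δ), the cross section measured at fixed p_T changes by (1 + δ)^(n−1).

```python
def cross_section_bias(scale_error, n):
    """Ratio of measured to true jet cross section at fixed p_T when every
    jet p_T is mis-scaled by (1 + scale_error), for a toy spectrum
    dsigma/dp_T proportional to p_T**(-n)."""
    return (1.0 + scale_error) ** (n - 1)


# Assumed spectral index n = 6 (illustrative only).
for delta in (0.01, 0.02, 0.05):
    bias = cross_section_bias(delta, n=6.0)
    print(f"scale error {delta:.0%} -> cross section shifted by {bias - 1.0:.1%}")
```

In this toy model a 2% error in the jet energy scale already shifts the cross section by about 10%.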


3The same tag-and-probe technique 
can also be used to determine efficien- 
cies for offline electron identification. 
The methodology is very powerful since 
it can be extended to any process that 
provides two independent objects to 
select. 


4In the beam direction, so much mo- 
mentum is carried by particles that 
are ‘lost’ down the beam pipe that we 
cannot make a useful measurement. 


5The QCD calculations are performed using perturbation theory,
utilizing the fact that at large transverse momentum, Q², the strong
coupling constant α_s(Q²) < 1. The NLO calculation includes Feynman
diagrams with one more power of α_s(Q²) than the leading-order
diagram.
Fig. 13.3 Jet cross section versus transverse energy, measured by the
ATLAS experiment at the LHC. The measurements are shown in slices of
the rapidity variable

$$y = \frac{1}{2}\ln\frac{E + p_z}{E - p_z}$$

where E is the energy and p_z is the momentum component along the beam
direction for the jet. The data are compared with the predictions from
NLO QCD (light-shaded bands at each data point) [24].


6The main source of photons is from the decays of particles like π⁰
and η.


7‘Underlying’ event refers to the interactions of the spectator
partons left behind after the hard parton-parton interaction.


8‘Pile-up’ refers to additional pp collisions that occur in the same
bunch crossing as a triggered event.


9ΔR = √(Δφ² + Δη²), where the pseudorapidity η = −ln(tan ½θ) and
θ and φ are the polar and azimuthal angles of the cell.
Jets are composed of hadrons as well as photons.⁶ The energy
determination of hadrons in a calorimeter is more difficult than that of
electrons or photons (see Chapter 4). In addition, there are uncertainties
in the reconstruction of the energy of a hadron jet, because any
jet-finding algorithm has to determine which energy depositions should be
considered as part of the jet. Therefore, there will be energy depositions
that fail to be counted as part of the jet and some energy depositions from
the ‘underlying’ event that are wrongly attributed to the jet.⁷ At high
luminosity, there is also the problem that some energy from the ‘pile-up’
events will also be wrongly associated with the jet.⁸

The simplest jet finder is based on the ‘cone’ algorithm. The
highest-p_T cell in the calorimeter is used as a ‘seed’ and the nearest
cell with transverse energy above some fixed threshold and within a radius
ΔR less than a fixed size is found.⁹ An updated value of the jet direction is
computed as the energy-weighted average of the two cells. The procedure 
is then iterated until no more cells are found to merge. The next jet 
is found starting from the seed of the remaining cell with the highest 
transverse energy. The procedure terminates when there are no unused 
cells above threshold. 

While the algorithm is simple from an experimental point of view, 
it has major theoretical problems that make comparisons with QCD 
predictions difficult. Consider an event with two jets that are separated 
and should be reconstructed as two separate jets. If there is a ‘soft’ 
particle between the two partons, this can cause the two jets to merge. 
This makes the results very sensitive to low-energy radiation, and the 
algorithm is described as not being ‘infrared-safe’. 

Several infrared-safe algorithms have been proposed based on sequen- 
tial recombination. We define a distance d;; between cells or jets i and j. 
We define a similar distance d_iB from a cell or a jet to the beam. The
clustering starts with the smallest distance d_ij or d_iB. If d_ij < d_iB, it
combines the entities i and j, otherwise the entity is called a jet and its
cells are removed from the list. The algorithm is iterated until there are
no more jets to be found.

Different algorithms use different definitions of distance. The most
commonly used algorithm is called the anti-k_T jet finder [58]:

$$d_{ij} = \min\left(p_{Ti}^{-2},\, p_{Tj}^{-2}\right)\left(\Delta R_{ij}/R\right)^2, \qquad d_{iB} = p_{Ti}^{-2} \qquad (13.4)$$

where ΔR is the same separation in rapidity and azimuthal angle as
used for the cone algorithm and p_Ti is the transverse momentum. R is
a fixed radius in (y, φ) space (similar to the fixed cone radius).¹⁰
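
The sequential recombination procedure can be sketched in a few lines. The code below is a deliberately simple O(N³) illustration using the standard anti-k_T distances d_ij = min(p_Ti⁻², p_Tj⁻²)(ΔR_ij/R)² and d_iB = p_Ti⁻². Objects are (p_T, y, φ) tuples, and merged objects use a p_T-weighted average direction, which is an assumed simplification of proper four-vector addition. The function names and the test event are hypothetical; production analyses use optimized libraries instead.

```python
import math


def delta_r2(a, b):
    """Squared separation in (y, phi) between two objects (pt, y, phi)."""
    dphi = abs(a[2] - b[2])
    if dphi > math.pi:
        dphi = 2.0 * math.pi - dphi
    return (a[1] - b[1]) ** 2 + dphi ** 2


def merge(a, b):
    """pT-weighted recombination (a simplification of 4-vector addition)."""
    pt = a[0] + b[0]
    y = (a[0] * a[1] + b[0] * b[1]) / pt
    # Average phi relative to a, so the wrap-around at +-pi is handled.
    dphi = math.remainder(b[2] - a[2], 2.0 * math.pi)
    phi = math.remainder(a[2] + b[0] * dphi / pt, 2.0 * math.pi)
    return (pt, y, phi)


def anti_kt(objects, radius=0.4):
    """Cluster (pt, y, phi) tuples with the anti-kT distance measure."""
    objs, jets = list(objects), []
    while objs:
        # Smallest beam distance d_iB = pt**-2 (i.e. the hardest object).
        best = min(range(len(objs)), key=lambda i: objs[i][0] ** -2)
        d_min, pair = objs[best][0] ** -2, None
        for i in range(len(objs)):
            for j in range(i + 1, len(objs)):
                d_ij = (min(objs[i][0] ** -2, objs[j][0] ** -2)
                        * delta_r2(objs[i], objs[j]) / radius ** 2)
                if d_ij < d_min:
                    d_min, pair = d_ij, (i, j)
        if pair is None:
            jets.append(objs.pop(best))  # beam distance wins: declare a jet
        else:
            i, j = pair
            merged = merge(objs[i], objs[j])
            objs.pop(j)                  # j > i, so remove j first
            objs.pop(i)
            objs.append(merged)
    return sorted(jets, key=lambda jet: -jet[0])


# Hypothetical event: two hard deposits, each with a nearby soft one.
event = [(100.0, 0.0, 0.0), (5.0, 0.1, 0.1), (80.0, 0.0, 3.0), (3.0, 0.2, 2.9)]
for jet in anti_kt(event, radius=0.4):
    print(jet)
```

Because the distance is weighted by the inverse square of the larger p_T, soft deposits cluster onto the nearest hard object before they cluster with each other, which is what makes the resulting jets insensitive to soft radiation.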
There are many contributions to the uncertainties in the measured 
energy of jets, so it is vital to use data-driven techniques to calibrate 
the jet energy scale. The following are some of these techniques: 


• p_T balance in γ-jet events. As the γ transverse energy can be
measured very accurately in the EM calorimeter, the ratio
p_T^jet/p_T^γ can be used to calibrate the jet energy scale. An
example of a calibration plot from this analysis [37] is shown in
Fig. 13.4.


• A similar technique can be used in events with Zs recoiling against
a hadronic jet. The well-measured transverse momentum of the Z
can be used to calibrate the jet p_T.
10 Typical values are in the range ΔR = 0.4-0.7.


Fig. 13.4 Jet energy calibration in ATLAS using the γ-jet balance
technique [37]. (a) Ratio of jet to photon transverse energies for data
and Monte Carlo simulation. (b) Ratio of the values from data and
Monte Carlo calculations (‘PYTHIA’). Note the different scales for the
two plots.
11This is particularly useful because there are too few events at very
high p_T to use the absolute calibration techniques. As multijets are
produced by purely strong interactions, the rates are higher than for
γ-jet or Z-jet events.


• Dijet balance: in two-jet events, the p_T of the two jets should be
equal in magnitude. This cannot provide an absolute calibration,
but it is a powerful tool for inter-calibration of different regions of
the calorimeter.


• Multijet balance: in three-jet events, if the calibration factors for
the two lowest-p_T jets are known, the p_T balance can be used to
calibrate the highest-p_T jet.¹¹


Production of W and Z bosons with subsequent leptonic decays pro- 
vides very clean channels for comparisons with QCD predictions. The 
first comparisons we consider are measurements of the cross sections for 
these processes. The methodology for the theoretical calculation in the 
parton model has been described in Chapter 9. For an ideal detector with 
100% efficiency and no background, the measured cross section would 
simply be related to the number of events observed, N_obs, and to the
integrated luminosity L by

$$N_{\rm obs} = L\sigma \qquad (13.5)$$


In a real detector, we have to subtract the estimated number of
background events, N_bkgd. We need to correct for the finite detector
efficiency ε. Finally, we also need to allow for the ‘acceptance’ A, which
gives the fraction of produced events for which the particle(s) are inside
the angular and momentum range that can be detected. Therefore, we
modify eqn 13.5 to give

$$\sigma = \frac{N_{\rm obs} - N_{\rm bkgd}}{\epsilon A L} \qquad (13.6)$$
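
The corrected cross section is a one-line calculation once the inputs are known. The sketch below applies the background subtraction and the efficiency, acceptance, and luminosity corrections described above. The input numbers are purely illustrative and are not measured values.

```python
def cross_section(n_obs, n_bkgd, efficiency, acceptance, luminosity):
    """sigma = (N_obs - N_bkgd) / (efficiency * acceptance * L).
    The result is in the inverse of the luminosity units
    (e.g. pb if the integrated luminosity is given in pb^-1)."""
    return (n_obs - n_bkgd) / (efficiency * acceptance * luminosity)


# Purely illustrative inputs.
sigma = cross_section(n_obs=10_000, n_bkgd=400, efficiency=0.75,
                      acceptance=0.5, luminosity=36.0)
print(f"{sigma:.1f} pb")
```

For an ideal detector (no background, ε = A = 1) the function reduces to N_obs/L, as in eqn 13.5.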

For these SM measurements, the backgrounds are generally small, so 
we will postpone a discussion of how to determine them until we consider 
searches for new physics (see Section 13.5). In principle, the efficiency
could be determined purely from a Monte Carlo calculation. However, we
can greatly reduce the uncertainty by also using ‘data-driven’
measurements based on the tag-and-probe technique (see Section 13.2). The
acceptance, by contrast, involves events that are not detectable for a
given detector, so we have to rely on Monte Carlo calculations. The Z bosons
are identified by detecting two oppositely charged leptons. In the case of
electrons and muons, the invariant mass of the pair can be reconstructed
and a very clean Z mass peak [22] is observed, as shown in Fig. 13.5 for
the case of Z → e⁺e⁻.
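
The pair invariant mass follows from the summed four-momentum of the two leptons, m² = E² − |p|². The sketch below assumes each lepton is given as (E, p_x, p_y, p_z) in GeV; the two back-to-back leptons are a hypothetical example.

```python
import math


def invariant_mass(p1, p2):
    """Invariant mass of two particles given as (E, px, py, pz) in GeV."""
    e = p1[0] + p2[0]
    px, py, pz = (p1[i] + p2[i] for i in (1, 2, 3))
    # Guard against tiny negative values from rounding.
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


# Hypothetical back-to-back massless leptons of 45.6 GeV each.
lep1 = (45.6, 45.6, 0.0, 0.0)
lep2 = (45.6, -45.6, 0.0, 0.0)
print(invariant_mass(lep1, lep2))
```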

The W → eν_e and W → μν_μ decays can be separated from the
background due to dijet events by identifying a well-measured high-pr 
electron or a muon and requiring a large value of the missing transverse 
momentum (see Section 13.2) to identify a neutrino. Nearly all hadrons 
will be stopped in the calorimeters, so muons can be identified by finding 
tracks in the muon chambers behind the calorimeters. A very clean peak 
in the missing-transverse-momentum distribution is observed above the 
background [63], as shown in Fig. 13.6. 


Fig. 13.5 e⁺e⁻ invariant mass spectrum measured by the ATLAS
experiment at the LHC [22]. (Plot: entries per 1 GeV versus m_ee in GeV;
2010 data at √s = 7 TeV, ∫L dt = 36 pb⁻¹, compared with the Z → ee and
QCD contributions.)


Fig. 13.6 (a) Distribution of missing transverse momentum E_T^miss for
W → μν measured by the CMS experiment at the LHC [63]. The data are
shown with error bars and the backgrounds are shown for other SM sources
of genuine muons and for misidentified muons from QCD jets. The dotted
histogram gives the fitted signal. (b) Fractional deviation (data − fit)/
error. (Plot: number of events per 2 GeV; 36 pb⁻¹ at √s = 7 TeV.)


The cross section for W production at LHC [22] is compared with
lower-energy pp̄ data and theoretical predictions in Fig. 13.7.

In order to compare theory with data, the calculations are shown for
both pp and pp̄. The theoretical predictions are in good agreement with
all the data. An interesting and more differential test of the SM at LHC
is given by measuring the charge asymmetry in W production:

$$A_W = \frac{\sigma(W^+ \to l^+\nu) - \sigma(W^- \to l^-\nu)}{\sigma(W^+ \to l^+\nu) + \sigma(W^- \to l^-\nu)} \qquad (13.7)$$


At the LHC, W⁺ (W⁻) will be produced mainly by a valence u (d) quark
colliding with a d̄ (ū) quark from the sea (see Chapter 9). Therefore, if
Fig. 13.7 Cross section times branching ratio for W production as a
function of CMS energy for pp and pp̄ [23].


12 These results together with other SM measurements at the LHC can be
used to significantly reduce the uncertainty in the parton distribution
functions at high values of Q².
we assume that the sea distributions are the same for ū and d̄ quarks,
we should expect

$$A_W > 0 \qquad (13.8)$$

We can determine the p_T of the ν using the missing-transverse-
momentum measurement, but we cannot determine the longitudinal
momentum of the neutrino. Therefore, as the 4-momentum of the W
cannot be uniquely reconstructed, it is convenient to measure the
asymmetry as a function of an angular variable. The chosen variable is the
pseudorapidity of the charged lepton, η = −ln(tan ½θ), where θ is the
polar angle. In pp interactions, the asymmetry is identical for positive
and negative η, so the distribution is symmetric about η = 0. At η ≈ 0,
the parton momentum fractions of the two protons will tend to be similar.
So typical x values will be given by M_W/√s ~ 0.01 (see Exercise 13.2).
For larger values of η, the parton momenta become more unequal and so the
asymmetry becomes sensitive to the ratio u(x)/d(x) at larger values of x.
From the knowledge of the parton distribution functions (see Chapter 9), we
know that u(x)/d(x) increases with x at high x, and, for x > 0.01,
u(x)/d(x) > 1. However, at very large values of η, the effect of the
angular distribution of the decay of the W (see Chapter 8) starts to
dominate and this generates an asymmetry with the opposite sign. We
therefore expect that the asymmetry should be positive and increasing
with η over a limited range and then decrease with η in the very forward
region. These general features agree with the data from ATLAS, CMS,
and LHCb [21], and an NLO QCD prediction (see Chapter 9) is in good
agreement with the data, as shown in Fig. 13.8.¹²
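
The way the parton momentum fractions become unequal away from the central region can be made concrete with leading-order kinematics. The sketch below uses the relations x₁x₂s = M_W² and y_W = ½ ln(x₁/x₂), so that x₁,₂ = (M_W/√s) e^(±y_W). Two assumptions should be noted: the variable is the rapidity of the W itself rather than the lepton pseudorapidity used in the measurement, and the W mass is taken as roughly 80.4 GeV.

```python
import math

M_W = 80.4  # GeV, approximate W mass (assumed value)


def parton_x(sqrt_s, y_w, mass=M_W):
    """Leading-order momentum fractions (x1, x2) for producing a W of
    rapidity y_w: x1 * x2 * s = M^2 and y_w = 0.5 * ln(x1 / x2)."""
    tau = mass / sqrt_s
    return tau * math.exp(y_w), tau * math.exp(-y_w)


for y in (0.0, 1.0, 2.0, 3.0):
    x1, x2 = parton_x(7000.0, y)
    print(f"y = {y:.0f}: x1 = {x1:.4f}, x2 = {x2:.5f}")
```

At y = 0 both fractions equal M_W/√s, about 0.01 at 7 TeV, while at larger rapidity one parton moves into the valence region and the other to very small x.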
13.3.1 Top-quark production


Another important test of the SM at LHC is top-quark production.
The cross section for tt̄ production at a CMS energy of 7 TeV (LHC in
2011) is predicted to be approximately 177.3 pb.¹³ The leptonic decays of
the t or t̄ will result in events with multijets, lepton(s), and missing
transverse momentum from the neutrino(s). These events therefore represent
potentially very large backgrounds for new-physics searches based on
missing-transverse-momentum signatures. Many measurements of tt̄
processes have been made. The cleanest channels are those with both top
quarks decaying semileptonically, t → blν. The largest background
process is Z + jets. The Z + jets background can be suppressed by removing
events with same-flavour, opposite-sign lepton pairs with an invariant
mass consistent with the mass of the Z. The signal events have two
neutrinos, which will result in missing transverse momentum (E_T^miss, see
Section 13.2). The Z + jets background will only have non-zero E_T^miss
because of the finite detector resolution. Therefore, this background can
be further suppressed by requiring a large value for E_T^miss. The signal
events will always contain two b quarks, so the purity can be enhanced
using ‘b-tagging’ based on the relatively long lifetime of b hadrons. A
likelihood¹⁴ for a given jet to contain a b hadron is constructed based on
several variables sensitive to lifetime, including the following:


• The impact parameter (see Chapter 8) divided by its error (called
the ‘significance’) for tracks in the jet.


• For each ‘secondary’ vertex, we can determine the decay length
from the ‘primary’ vertex. We then define the decay length
significance for any secondary vertices associated with the jet as the
decay length divided by its error.


A cut on the magnitude of the likelihood is made so as to obtain a 
b-tagging efficiency of ~80%. This technique obviously requires very 
precise tracking detectors and in particular a very high-precision silicon 
pixel detector as close to the beam line as possible. The ‘irreducible’ 


Fig. 13.8 W charge asymmetry as a 
function of lepton pseudorapidity as 
measured by ATLAS, CMS, and LHCb 
at LHC [21]. 


13The top cross section for pp at a CMS energy of 7 TeV (LHC in 2011)
is a factor of ~25 greater than that for pp̄ at 2 TeV (Tevatron). At the
Tevatron, at the typical value of x ~ 0.2, the cross section is dominated
by qq̄ processes. At the LHC, the smaller x values mean that the cross
section is dominated by gg processes.


14If the probability of a given measurement i under a given hypothesis
is p_i, the likelihood is constructed from a set of N measurements as
L = ∏_{i=1}^{N} p_i.
Fig. 13.9 Jet multiplicity distributions for dilepton events in ATLAS
for (a) all jets and (b) b-tagged jets [25].


Fig. 13.10 Measurements of the tt̄ cross section at 7 and 8 TeV using eμ
b-tag events together with results at 7 TeV using the ee, μμ, and eμ
channels measured by ATLAS at the LHC [33].


background (see Chapter 12 for a discussion of reducible and irreducible 
backgrounds) is estimated in such a way as to minimize the reliance on 
Monte Carlo simulations. The general strategy is based on the use of 
kinematic selections to define ‘control’ regions (in which the events arise 
mainly from one background source) and ‘signal’ regions. This method- 
ology is explained in Section 13.5.3, where the backgrounds are much 
more significant. For this measurement, we define a control region in 
the measured data in which the mass of two same-flavour opposite-sign 
(SFOS) leptons is compatible with the Z. The Monte Carlo simulation 
is only used to extrapolate this number to the signal region in which 
this mass is incompatible with the Z. The distribution of number of jets 
in this signal region before and after b-tagging [25] is shown in Fig. 13.9. 
A clean signal is observed above the background even before b-tagging, 
but the power of b-tagging to greatly enhance the purity of the signal 
is clearly demonstrated. The measurements of the tt̄ cross section at
CMS energies of 7 and 8 TeV from ATLAS [33] are shown in Fig. 13.10.
The good agreement between the data and the SM for these and many
other results at the LHC gives us confidence in the reliability of the SM 
in this energy regime. We can therefore use the SM to reliably predict 
backgrounds for a wide range of possible new physics processes. 

One other very important measurement is the mass of the top quark, 
because this affects the mass of the W and Higgs bosons via radiative 
corrections (see Chapter 7). This was used in the past to pin down the 
mass of the SM Higgs boson, but now that we have observed a Higgs bo- 
son, we can combine the three precision mass measurements in a powerful 
consistency check of the SM. Any significant inconsistency would pro- 
vide evidence of new physics. The current world average based on the 
Tevatron measurements by CDF and D0 and the ATLAS and CMS
measurements at the LHC gives [19] a value of m_t = 173.34 ± 0.76 GeV.
This result is consistent with the SM, but further improvements in the 
precision will be possible with more data at the LHC. 


13.4 Beyond the Standard Model physics 


There are several reasons to expect BSM physics to emerge at the TeV 
scale being explored at the LHC. The first general argument is simply 
that there are too many free parameters in the SM and a more unified 
theory should contain fewer. The Higgs mechanism is consistent with 
current data, but the theory is rather contrived and it is hoped that BSM 
physics will be able to explain the origin of the spontaneous symmetry 
breaking required in the Higgs mechanism. 

An important argument that points to new physics being manifest at 
the TeV scale is known as the hierarchy problem in the SM. The mass 
of the Higgs boson has radiative corrections from the Feynman diagram 
with a fermion loop (see Fig. 13.11). In this Feynman diagram, we have 
to consider the propagators for the fermions in the loop. In Chapter 7, 
we only evaluated Feynman diagrams with boson propagators (e.g. the 
W± or Z). The Feynman rules for a spin-½ propagator (see Langacker
in Further Reading) of momentum q and mass m give a factor¹⁵

$$\frac{i}{\gamma^\mu q_\mu - m} \qquad (13.9)$$

Momentum has to be conserved at every vertex in a Feynman diagram, 
but this allows for an infinite range in the momentum of the virtual fer- 
mion. To evaluate this Feynman diagram, we therefore have to integrate 
over the momentum q of the virtual fermion in the loop. After averaging
over the spin states by taking the trace of the fermion propagators [80],
we can show that the contribution to the squared mass
of the Higgs boson is given by

$$(\Delta m_H)^2 = -C g_f^2 \int_0^\infty d^4q \left(\frac{1}{\gamma^\mu q_\mu - m}\,\frac{1}{\gamma^\mu q_\mu - m}\right) \qquad (13.10)$$


Fig. 13.11 Feynman diagram for the 
radiative correction to the mass of a 
Higgs boson from a fermion loop. 


15See Chapter 6 for the definition of the γ_μ matrices.


16We might be worried that we would 
have similar quadratic divergence for 
fermion loops between gauge bosons in 
the SM. However, the SM contains an 
approximate chiral symmetry that pre- 
vents this happening (see Aitchison in 
Further Reading). 


17The Planck scale, ~10¹⁹ GeV, is the energy scale at which
quantum-gravitational effects become important. The GUT scale,
~10¹⁵ GeV, is the scale in a grand unified theory at which the strengths
of the strong, electromagnetic, and weak interactions become equal.


Fig. 13.12 Feynman diagram for the radiative corrections to the mass of
the Higgs boson from a scalar boson loop diagram.


where g_f is the coupling constant for the fermion-Higgs boson interaction
and C contains other numerical constants. Note that eqn 13.10 gives 
the contribution from one type of fermion, so we need to sum over all 
fermions. However, the dominant contribution comes from the top quark 
since it has the largest coupling because it is by far the heaviest fermion. 
The negative sign arises because there is a fermion loop and this is 
related to the opposite intrinsic parities of fermions and antifermions. 
We can write d⁴q = dq₀ d³q = dq₀ |q|² d|q| dΩ. Therefore, simply by
counting the powers of q, we can see that this integral is ‘quadratically
divergent’,¹⁶ i.e. if we introduce a cutoff Λ, the integral will have a term
proportional to Λ². The leading term gives

$$(\Delta m_H)^2 = -\frac{g_f^2}{8\pi^2}\Lambda^2 \qquad (13.11)$$

If there is no new physics up to some scale Λ, then the mass squared of
the Higgs boson will also have radiative corrections of the order of Λ².
If Λ is given by the Planck scale or even by a GUT scale,¹⁷ then the
natural mass for the Higgs boson would be of the order of 10¹⁹ GeV
or 10¹⁵ GeV, respectively. We know that the physical mass of the Higgs
boson is of the order of 100 GeV, so in the SM this requires counterterms
that have to be fine-tuned to 1 part in 10¹⁵ to ensure the nearly perfect
cancellation of the bare mass with the radiative correction. This is called 
the hierarchy problem. 
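
The power counting behind the quadratic divergence can be checked numerically with a toy integral. The sketch below is not the loop calculation itself. It assumes a Euclidean-style radial integrand q³/(q² + m²), which has the same large-q behaviour as a four-dimensional integral over an integrand falling like 1/q², and it uses the closed form ∫₀^Λ q³ dq/(q² + m²) = ½[Λ² − m² ln(1 + Λ²/m²)]. The mass value is only an illustrative choice of the order of the top-quark mass.

```python
import math


def loop_integral(cutoff, mass):
    """Closed form of the toy radial integral
    int_0^cutoff q^3 dq / (q^2 + m^2),
    a stand-in for the loop integral after the angular integration."""
    return 0.5 * (cutoff ** 2
                  - mass ** 2 * math.log(1.0 + cutoff ** 2 / mass ** 2))


m_top = 173.0  # GeV (illustrative)
for cutoff in (1e4, 1e8, 1e15):
    ratio = loop_integral(2 * cutoff, m_top) / loop_integral(cutoff, m_top)
    print(f"cutoff = {cutoff:.0e} GeV: I(2*cutoff)/I(cutoff) = {ratio:.4f}")
```

Doubling the cutoff multiplies the integral by a factor approaching 4, the signature of a term proportional to Λ², with only a logarithmic dependence on the mass left over.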


13.4.1 Supersymmetry 


The most popular solution to the hierarchy problem is to invoke super- 
symmetry (SUSY). This is a symmetry that transforms bosons into 
fermions and vice versa. This implies that every SM particle must have a 
superpartner with spin differing by ½. If SUSY is correct, then the Higgs
boson mass squared will receive radiative corrections from Feynman dia- 
grams with a scalar loop as shown in Fig. 13.12. The contribution to the 
square of the μ parameter (see Chapter 12) is given by (see Aitchison in
Further Reading)

$$\Delta\mu^2 = C\lambda \int_0^\infty \frac{d^4k}{k^2 - M_H^2} \qquad (13.12)$$


where C is a numerical constant. As for the calculation of the fermion
loop above, we can write d⁴k = dk₀ d³k = dk₀ |k|² d|k| dΩ, which
contains the fourth power of k, and hence this integral is also
‘quadratically divergent’. This means that if we introduce a cut-off
parameter Λ to the integral, we find (C′ is another numerical constant)

$$\Delta\mu^2 = C'\lambda\Lambda^2 \qquad (13.13)$$

The radiative correction to the μ² parameter then introduces a correction
to the mass squared of the Higgs boson of

$$(\Delta m_H)^2 = \frac{g_s}{16\pi^2}\Lambda^2 \qquad (13.14)$$

where g_s is the coupling constant for the scalar interaction.
Comparing eqns 13.11 and 13.14, we see that we will have perfect
cancellation of the unwanted quadratic divergence if g_f² = g_s and we
have two scalar partners for every SM fermion. This happens naturally 
in SUSY, which is an extension of the symmetries of the SM relating 
bosons and fermions. In SUSY, the two scalar particles for each SM 
fermion arise from the left- and right-handed fermions. The equality of 
the coupling constants is guaranteed by the symmetry. 

Unbroken SUSY therefore solves the hierarchy problem and introduces 
no new parameters into the theory. However, SUSY is manifestly broken, 
because we have not yet discovered any of the SUSY partners of the 
SM particles, so their masses must be greater.¹⁸ However, if the masses 
are very much heavier than the SM partners, then some fine tuning 
would be required to prevent the Higgs boson mass becoming too large. 
Therefore, ‘naturalness’ arguments suggest that the masses of the SUSY 
partners should be of the same order of magnitude as the Higgs vacuum 
expectation value (246 GeV). 
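
The naturalness argument can be made concrete with a rough numerical sketch. The code below assumes a correction of the generic form $g^2\Lambda^2/(16\pi^2)$ with a coupling of order 1; both the prefactor and the coupling value are illustrative assumptions, chosen only to show how the required cancellation grows with the cut-off scale.

```python
import math

M_HIGGS = 125.0  # GeV, approximate physical Higgs boson mass


def correction_to_mass_squared(cutoff_gev, coupling=1.0):
    """Illustrative quadratic correction g^2 * Lambda^2 / (16 pi^2), in GeV^2.

    The 1/(16 pi^2) prefactor and coupling = 1 are assumptions made only
    to set the scale; they are not taken from a specific calculation.
    """
    return coupling**2 * cutoff_gev**2 / (16.0 * math.pi**2)


def tuning_ratio(cutoff_gev, coupling=1.0):
    """Correction divided by the physical Higgs mass squared.

    A ratio much larger than 1 means that the counterterm must cancel
    the correction to roughly 1 part in that ratio.
    """
    return correction_to_mass_squared(cutoff_gev, coupling) / M_HIGGS**2


if __name__ == "__main__":
    for cutoff in (1.0e3, 1.0e4, 1.0e15):
        print(f"cut-off = {cutoff:.0e} GeV -> ratio = {tuning_ratio(cutoff):.2e}")
```

With these assumptions, a cut-off near 1 TeV needs no delicate cancellation, whereas a cut-off near the unification scale requires an enormous one.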

There are many different options for how SUSY is broken.¹⁹ The 
masses of the superpartners are not protected by the gauge symmetries 
that force the masses of fermions and gauge bosons to be zero before 
spontaneous symmetry breaking is considered. In this sense, it is not 
surprising that the masses of SUSY particles are larger than those of 
their SM partners. 

The particle content of the SUSY extension to the first generation of 
the SM is given in Table 13.1. SUSY particles are labelled with a ‘tilde’ 
(~) above their name; for example, the SUSY partner of the electron is 
the selectron $\tilde{e}$. SUSY partners of quarks and leptons are called squarks 
and sleptons, respectively. SUSY partners of neutrinos, gauge bosons, 
and Higgs bosons are called sneutrinos, gauginos, and Higgsinos. 

After SUSY breaking, there will in general be mixing of all states with 
the same quantum numbers. For example, the ‘left’ and ‘right’ $\tilde{t}$ (stop) 
states will mix to form the physical states called $\tilde{t}_1$ and $\tilde{t}_2$. This mass 
splitting can be very large, so it is possible that the lightest squark 
belongs to the third generation. The Higgs sector in SUSY is extended 
compared with the SM because two complex Higgs doublets are required 
to give masses to the up-type and down-type quarks. Each complex 
doublet consists of four fields. After spontaneous symmetry breaking, 
three of the eight fields are ‘swallowed up’ to give mass to the $W^\pm$ and 
$Z^0$ bosons. This leaves five physical spin-0 fields: two charged fields ($H^\pm$) 
and three neutral fields (a pseudoscalar A and two neutral scalars h and 
H). There are also SUSY partners for these states. There is then mixing 
between the SUSY partners of the neutral gauge bosons $\tilde{\gamma}$, $\tilde{Z}^0$ and the 
neutral Higgsino states $\tilde{h}$, $\tilde{H}$ to give the physical neutralino states 
$\tilde{\chi}^0_1$, $\tilde{\chi}^0_2$, $\tilde{\chi}^0_3$, $\tilde{\chi}^0_4$, 
where the states are labelled in order of increasing mass. Similarly, the 
$\tilde{W}^\pm$ states mix with the $\tilde{H}^\pm$ states to give the charginos $\tilde{\chi}^\pm_1$, $\tilde{\chi}^\pm_2$. 

Although there is currently no direct evidence for SUSY, there are 
some suggestive hints that it might be a valid low-energy theory (i.e. at 
the electroweak scale ~100 GeV): 


• SUSY helps with the unification of the electroweak and strong 
coupling constants. If we assume that there should be some grand 
unification of the electroweak and strong couplings at some high 
energy, then we can use the ‘low-energy’ data and the renormalization 
group equations²⁰ to see if this happens. In the context 
of the SM, the coupling constants do not quite coincide at any 
scale. However, if one assumes SUSY with a mass scale of the order 
of a TeV, then the coupling constants do coincide as shown in 
Fig. 13.13. 


¹⁸ There is clearly no scalar charged 
particle with the same mass as the 
electron (511 keV). 


¹⁹ SUSY breaking must be done in such 
a way as to preserve the gauge symmetries 
in the SM, so that the SM particles 
only acquire mass through spontaneous 
symmetry breaking. 


²⁰ The renormalization group equations 
are derived from the underlying group 
theory and allow the determination of 
the change of coupling ‘constants’ with 
scale. See the discussion in Chapter 9 
for a simple explanation in the context 
of QED and QCD. 


Table 13.1 Example of SUSY particles for the first generation of quarks and leptons 
and for the gauge bosons. Columns: SM particles, spin, SUSY partners, spin. 


Fig. 13.13 Evolution of the coupling 
constants in the SM and in SUSY [70]. Horizontal axes: $\log_{10}[Q\ (\mathrm{GeV})]$. 
• In SUSY models, we can ensure that the lightest supersymmetric 
particle (LSP) will be stable (see Section 13.4.2) and will therefore 
provide a natural candidate for dark matter. This requires SUSY 
particles at a mass scale of ~TeV. 


• In the SM, spontaneous symmetry breaking has to be put in by 
hand. The mass terms in the Lagrangian evolve with scale (‘run’). 
If the mass-squared term for the Higgs potential is positive at some 
high (unification) scale, then the large top-quark Yukawa coupling 
will drive this term negative (see Chapter 12), thus giving some 
explanation for the existence of spontaneous symmetry breaking 
at low energy. This argument also requires that the mass scale of 
the SUSY partners is ~TeV (see Martin in Further Reading). 


• In the SM, the Higgs boson mass is unconstrained. In the minimal 
SUSY SM (MSSM) at ‘tree level’,²¹ the Higgs boson mass should 
be given by [83] 


$$m_H^2 = M_Z^2 \cos^2 2\beta \tag{13.15}$$


where $\tan\beta$ is the ratio of the vacuum expectation values of the Higgs 
fields coupling to up-type and down-type quarks. Therefore, 
at tree level, the Higgs boson is constrained to be lighter than 
the Z. However, there are radiative corrections to the Higgs boson 
mass and an upper limit is given by [83] 


$$\Delta m_H^2 = \frac{3g^2 m_t^4}{8\pi^2 M_W^2}\left[\ln\left(\frac{M_S^2}{m_t^2}\right) + \frac{X_t^2}{M_S^2}\left(1 - \frac{X_t^2}{12M_S^2}\right)\right] \tag{13.16}$$

where g is the weak coupling constant, $M_S$ is the geometric mean 
of the stop quark masses ($M_S^2 = m_{\tilde{t}_1} m_{\tilde{t}_2}$), and $X_t$ is a stop mixing 
parameter. The MSSM still predicts a relatively low value $m_H <$ 
140 GeV, which is consistent with the recent results from Higgs 
boson searches at the LHC, which give $m_H \sim 125$ GeV. 
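
The tree-level relation (eqn 13.15) and the radiative correction (eqn 13.16) can be explored numerically. The sketch below is illustrative only: the numerical inputs (masses and the value of g) are approximate values that we have assumed, and a leading-order expression of this kind should not be expected to reproduce the quoted bound of 140 GeV, which relies on more complete calculations.

```python
import math

M_Z = 91.19     # GeV, Z boson mass (approximate)
M_W = 80.38     # GeV, W boson mass (approximate)
M_TOP = 173.0   # GeV, top-quark mass (approximate)
G_WEAK = 0.65   # weak coupling constant (approximate, assumed value)


def tree_level_mass_squared(tan_beta):
    """Tree-level value M_Z^2 cos^2(2 beta) from eqn 13.15, in GeV^2."""
    beta = math.atan(tan_beta)
    return M_Z**2 * math.cos(2.0 * beta) ** 2


def radiative_correction(m_susy, x_t):
    """Correction to the mass squared from eqn 13.16, in GeV^2.

    m_susy is the geometric mean of the stop masses (M_S) and x_t is
    the stop mixing parameter, both in GeV.
    """
    prefactor = 3.0 * G_WEAK**2 * M_TOP**4 / (8.0 * math.pi**2 * M_W**2)
    log_term = math.log(m_susy**2 / M_TOP**2)
    ratio = x_t**2 / m_susy**2
    mixing_term = ratio * (1.0 - ratio / 12.0)
    return prefactor * (log_term + mixing_term)


def higgs_mass(tan_beta, m_susy, x_t):
    """Higgs boson mass in GeV, combining eqns 13.15 and 13.16."""
    return math.sqrt(tree_level_mass_squared(tan_beta) + radiative_correction(m_susy, x_t))


if __name__ == "__main__":
    for x_t in (0.0, 1000.0, 2000.0):
        print(f"X_t = {x_t:6.0f} GeV -> m_H = {higgs_mass(10.0, 1000.0, x_t):.1f} GeV")
```

The mixing term is largest when $X_t^2 = 6M_S^2$, which is why large stop mixing is often invoked to raise the predicted Higgs boson mass.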


13.4.2 R-parity 


One potentially fatal problem with SUSY is that in general it will allow 
quarks to convert to leptons and thus allow Feynman diagrams that 
would mediate very rapid proton decay as shown in Fig. 13.14. 

The simplest solution to this problem is to invoke a new multiplicative 
parity, called R-parity, given by $R_p = (-1)^{3(B-L)+2s}$, where B is 
the baryon number, L the lepton number, and s the spin. All the SM 
particles have $R_p = +1$ and all the SUSY partners have $R_p = -1$. This 
has profound consequences: 


• The lightest SUSY particle (LSP) is absolutely stable, since any 
decay to SM particles would violate R-parity. This has the attractive 
feature that the LSP is a natural candidate for dark matter (see 
Section 13.7). 


• Heavier SUSY particles will decay to an odd number (usually one) 
of SUSY particles (and SM particles). 


• SUSY particles will always be produced in pairs in collider 
experiments. 


• The LSP will have very small cross sections for interactions with 
ordinary matter because any interaction will be suppressed by the 
high SUSY mass scale. Therefore, from the experimental point of 
view, LSPs will behave like neutrinos by not interacting in detectors. 
It is assumed that the LSP is neutral, because otherwise there 
would be a measurable relic density in nuclei. 


²¹ By tree level we mean a calculation 
using only the lowest-order Feynman 
diagrams. 


Fig. 13.14 Feynman diagram 
mediating proton decay in SUSY, 
$p \to e^+\pi^0$. 
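
The R-parity assignment can be checked directly from the definition $R_p = (-1)^{3(B-L)+2s}$. The following minimal sketch evaluates it for a few SM particles and their superpartners; the function name and the table of states are ours, introduced only for illustration.

```python
from fractions import Fraction


def r_parity(baryon_number, lepton_number, spin):
    """Return R_p = (-1)^(3(B - L) + 2s) as +1 or -1."""
    exponent = 3 * (Fraction(baryon_number) - Fraction(lepton_number)) + 2 * Fraction(spin)
    if exponent.denominator != 1:
        raise ValueError("3(B - L) + 2s must be an integer")
    return 1 if exponent.numerator % 2 == 0 else -1


# (B, L, s) for a few states; Fractions keep B = 1/3 and s = 1/2 exact.
HALF = Fraction(1, 2)
THIRD = Fraction(1, 3)
STATES = {
    "electron": (0, 1, HALF),
    "selectron": (0, 1, 0),
    "up quark": (THIRD, 0, HALF),
    "up squark": (THIRD, 0, 0),
    "gluon": (0, 0, 1),
    "gluino": (0, 0, HALF),
    "Higgs boson": (0, 0, 0),
    "Higgsino": (0, 0, HALF),
}

if __name__ == "__main__":
    for name, quantum_numbers in STATES.items():
        print(f"{name:12s} R_p = {r_parity(*quantum_numbers):+d}")
```

Because a superpartner has the same B and L as its SM partner but a spin differing by 1/2, the exponent changes by an odd integer and the sign of $R_p$ always flips.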


Even in the context of R-parity-conserving SUSY models, there are 
far too many parameters to allow a model-independent survey of the 
full parameter space. One way to get round this problem is to assume a 
simplified model in which some of the key parameters of the MSSM are 
set by hand. 


13.4.3 Other BSM theories 


Another solution to the hierarchy problem is to replace the fundamental 
Higgs scalar with a composite object. Theories based on this idea are 
called ‘technicolour’, but are not discussed here since it is difficult to 
construct such a theory that is compatible with all the precision elec- 
troweak data. Technicolour theories are also disfavoured by the LHC 
observation of a boson that is compatible with the properties of the SM 
Higgs boson. 

Another interesting option is to assume that there are ‘large’ extra 
dimensions. In these models, the SM particles are restricted to a four- 
dimensional ‘brane’ but gravity can propagate in the higher-dimensional 
‘bulk’. Gravity then appears weak in our four-dimensional world be- 
cause the gravitons can propagate into the bulk. At distances shorter 
than the characteristic length of the theory, there would be deviations 
from the inverse square law for gravity. However, experiments to detect 
such an effect have not found any deviation from the inverse square law. 
In these models, the fundamental Planck scale can be as low as ~TeV. 
This means that the cut-off in the divergent integral for the Higgs boson 
mass is ~TeV, thereby avoiding the hierarchy problem. 

There are many versions of these models, but one generic feature is 
that at TeV scales particles can leak from the brane into the bulk and 
thus cause an apparent violation of momentum conservation in four 
dimensions. Therefore, from an experimental point of view, we would 
expect events with large missing transverse momentum. More specula- 
tively, some of these models predict the production of micro black holes 
at the LHC with significant cross sections. The non-observation of mi- 
cro black holes at the LHC therefore provides severe constraints on these 
theories. 
13.5 Experimental searches for BSM physics at the LHC 


There are a bewilderingly large number of possible BSM theories that 
might be experimentally accessible. It is therefore essential that some 
searches are geared to finding unexpected signals while others are op- 
timized for some particular model, such as SUSY. This section merely 
aims to give the flavour of these searches. 


13.5.1 Jet production 


The large QCD jet production cross sections at LHC provide access to 
significant numbers of events at very high mass. These can be used to 
search for resonances in the mass spectra as would be expected from 
models like excited quarks as well as new contact interactions. A 
particularly sensitive way to search for new physics is to use the angular 
distribution in the dijet CMS, because many experimental systematic 
effects (such as the jet energy scale) are thereby greatly reduced. The 
angular distribution is studied as a function of the variable 


$$\chi = \frac{1 + \cos\theta^*}{1 - \cos\theta^*} \tag{13.17}$$


where $\theta^*$ is the polar angle of one of the jets in the dijet CMS. 
The advantage of the $\chi$ variable is that the forward-peaked angular 
distribution for $\theta^*$ is transformed into a more uniform distribution as 
a function of $\chi$. New physics such as a contact interaction will tend to 
produce a more isotropic distribution in $\cos\theta^*$ and therefore a peak at 
small values of $\chi$. The distribution of $\chi$ for a range of dijet invariant 
masses [32] is shown in Fig. 13.15. No significant deviation from the SM 
is seen and limits on new physics can be placed.²² 


²² In this case, limits were placed on 
models with extra dimensions, but the 
data can be used to place limits on 
other BSM physics. 


13.5.2 Lepton pair resonances 


Many BSM theories predict extra U(1) or SU(2) symmetries, which 
would result in heavier versions of W and Z bosons, called W’ and Z’. 
The signature for a W’ would be a Jacobian peak in the transverse mass 
(see Chapter 8, eqn 8.22) above $M_W$. The signature for a Z’ would be 
a narrow peak in the $l^+l^-$ invariant mass distribution above $M_Z$. These 
would give clear signals above the SM backgrounds, but so far only upper 
limits have been reported; see for example Fig. 13.16 [27]. 
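
The two discriminating variables used in these searches are simple to compute from the measured lepton kinematics. The sketch below uses the standard massless-lepton expressions, $m_{ll}^2 = 2p_{T1}p_{T2}(\cosh\Delta\eta - \cos\Delta\phi)$ and $m_T^2 = 2p_T^l E_T^{\rm miss}(1 - \cos\Delta\phi)$; we assume that the latter matches the definition in eqn 8.22, and the function names are ours.

```python
import math


def dilepton_mass(pt1, eta1, phi1, pt2, eta2, phi2):
    """Invariant mass of two massless leptons from (pT, eta, phi), in the units of pT."""
    return math.sqrt(2.0 * pt1 * pt2 * (math.cosh(eta1 - eta2) - math.cos(phi1 - phi2)))


def transverse_mass(pt_lepton, phi_lepton, met, phi_met):
    """Transverse mass of a lepton and the missing transverse momentum."""
    return math.sqrt(2.0 * pt_lepton * met * (1.0 - math.cos(phi_lepton - phi_met)))


if __name__ == "__main__":
    # Two back-to-back central leptons of 1000 GeV each, as might arise
    # from the decay at rest of a hypothetical 2 TeV Z'.
    print(dilepton_mass(1000.0, 0.0, 0.0, 1000.0, 0.0, math.pi))
    # A 40 GeV lepton back to back with 40 GeV of missing transverse momentum.
    print(transverse_mass(40.0, 0.0, 40.0, math.pi))
```

For a resonance decaying at rest to two central, back-to-back leptons, the invariant mass is simply twice the lepton transverse momentum, which is why a heavy Z’ would appear as a peak far above the SM Drell–Yan spectrum.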


13.5.3 SUSY searches 


If we assume R-parity-conserving SUSY, then SUSY particles will be 
produced in pairs, such as $\tilde{g}\tilde{g}$. The cross sections can be calculated at 
the parton level, since all the couplings are the same as in the SM. Using 
the measured parton distribution functions (see Chapter 9), we can then 
Fig. 13.15 Distribution of $\chi$ for slices 
of dijet invariant mass, measured by 
ATLAS and compared with QCD pre- 
dictions (shaded bands) and a BSM 
theory with extra dimensions (dashed 
line) [32]. 


Fig. 13.16 Distribution of dilepton in- 
variant mass, measured by ATLAS and 
compared with SM expectations and 
models with different Z’ masses [27]. 
calculate the cross section for SUSY pair production in pp collisions. 
The results are shown in Fig. 13.17 as a function of the average mass 
of the pair of sparticles [119]. There is a relatively large cross section 
for strongly interacting squarks and gluinos, but the cross sections for 
particles produced via electroweak interactions are much smaller. The 
cross section for a given type of particle decreases rapidly with the mass 
of the particle. 

While the production cross sections are well defined, the decay modes 
for a given sparticle depend on which modes are kinematically allowed. 
They also depend on the model parameters that are used for SUSY 
breaking. However, if we assume R-parity-conserving SUSY, then the 
end result of the decays of two sparticles will be two LSPs. The LSPs 
will not be detected directly in a detector. Therefore, a large value of 
$E_T^{\rm miss}$ will be used as a signature for SUSY searches. In order to establish 
evidence for a possible SUSY signal, all possible sources of background 
must be understood. After removing instrumental backgrounds such as 
‘beam halo’ interactions²³ and fake $E_T^{\rm miss}$ generated by the finite detector 
resolution, the main backgrounds are SM processes that have genuine 
$E_T^{\rm miss}$ from final-state neutrinos. Some of the main SM backgrounds are 


(1) Z + jets with $Z \to \nu\bar{\nu}$; 


(2) W + jets with $W \to l\nu$; 


(3) $t\bar{t}$ with at least one top quark decaying semileptonically to produce 
neutrinos. 


The first reaction is an example of an irreducible background in that even 
with a perfect detector one could not separate it from signal on an event- 
by-event basis. Therefore, it is essential to have a very reliable prediction 
of the rate for this process. This can be done in a largely data-driven 
way, thus minimizing theoretical uncertainties: the rate for Z + jets with 
$Z \to l^+l^-$ can be measured directly, and if the $l^+l^-$ are removed, this 
provides a simulation of $Z \to \nu\bar{\nu}$. The ratio of branching ratios of Z 
to leptons and neutrinos is very well known from LEP. The effect of 
finite efficiencies for triggering and identifying charged leptons can be 
measured in a data-driven way using $Z \to l^+l^-$ events. A Monte Carlo 
calculation is only required to allow for the finite detector acceptance of 
the charged leptons (the acceptance for neutrinos is 100%). The other 
reactions generate ‘reducible’ backgrounds that could be completely re- 
jected with an ideal detector because they result in charged leptons, 
whereas some of the SUSY signal will not contain any charged leptons. 
Data-driven techniques are required to estimate the effects of the finite 
detector efficiency. 
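
The arithmetic behind the data-driven estimate of the $Z \to \nu\bar{\nu}$ background can be sketched as follows. The branching ratios are approximate values that we have assumed (about 20.0% for all neutrino flavours combined and about 3.37% per charged-lepton flavour); the event count, efficiency, and acceptance in the usage example are hypothetical numbers chosen only for illustration.

```python
BR_Z_TO_NUNU = 0.200    # Z -> neutrinos, all three flavours (approximate)
BR_Z_TO_LL = 0.03366    # Z -> l+ l-, per charged-lepton flavour (approximate)


def predict_z_to_nunu(n_z_to_ll, lepton_efficiency, lepton_acceptance, n_lepton_flavours=1):
    """Estimate the number of Z(-> nu nu) + jets events from observed Z(-> l l) + jets.

    lepton_efficiency is the per-event efficiency to trigger on and identify
    the lepton pair (measured in data); lepton_acceptance is the per-event
    detector acceptance for the pair (from Monte Carlo). The acceptance for
    neutrinos is taken to be 100%.
    """
    if not 0.0 < lepton_efficiency <= 1.0 or not 0.0 < lepton_acceptance <= 1.0:
        raise ValueError("efficiency and acceptance must lie in (0, 1]")
    produced_z_to_ll = n_z_to_ll / (lepton_efficiency * lepton_acceptance)
    return produced_z_to_ll * BR_Z_TO_NUNU / (n_lepton_flavours * BR_Z_TO_LL)


if __name__ == "__main__":
    # Hypothetical inputs: 120 selected Z -> mu mu events, 80% efficiency, 60% acceptance.
    print(predict_z_to_nunu(120, 0.80, 0.60))
```

The ratio of branching ratios is close to 6 per charged-lepton flavour, so the statistical precision of the prediction is limited by the much smaller $Z \to l^+l^-$ sample.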


Fig. 13.17 Predicted cross section for 
the production of pairs of SUSY particles 
for pp collisions at a CMS energy 
$\sqrt{s} = 8$ TeV. From [119]. 


²³ ‘Beam halo’ refers to beam particles 
that have escaped from the beam pipe. 
²⁴ This is a simplified discussion, since 
it assumes that each control region is 
only populated by events from one SM 
process. Nevertheless, it should give a 
clear idea of the approach used. Note 
that it does not entirely eliminate the 
reliance on Monte Carlo simulations, 
but it only requires the Monte Carlo 
predictions of ratios, so the system- 
atic uncertainties are greatly reduced 
compared with relying on Monte Carlo 
to predict the absolute number of SM 
events in the SR. 


Fig. 13.18 Distributions for signal 
and background (vertical axis: number of events). The x axis is a proxy 
for a suitable variable like $E_T^{\rm miss}$. 


The kinematics of SUSY signals and SM backgrounds are very differ- 
ent and this is exploited in SUSY searches. SUSY events will typically 
contain more high-$p_T$ jets than the SM backgrounds, and SUSY events 
will tend to be at larger values of $E_T^{\rm miss}$ than those from the SM. At 
low values of $E_T^{\rm miss}$, the distribution will be dominated by the SM backgrounds 
but any significant excess at high values of $E_T^{\rm miss}$ would be 
evidence for SUSY. 

A general approach to estimating the SM backgrounds in searches for 
BSM physics is based on using Monte Carlo simulations to 


(1) Define suitable signal regions (SR) in order to optimize the discov- 
ery potential for a particular range of parameters of a model. That 
is, we want to maximize the acceptance for BSM physics while 
simultaneously minimizing the number of SM background events 
that would be expected. 


(2) Define appropriate control regions (CR) that isolate one particular 
SM process. In such a region, we would expect this process to be 
dominant and we would not expect any significant ‘contamination’ 
from BSM physics. 


We can then measure the number of events in the control region in real 
data ($N^{\rm CR}_{\rm data}$) and Monte Carlo simulated data ($N^{\rm CR}_{\rm MC}$). Finally, we need 
to use the Monte Carlo simulation to calculate the ‘transfer factor’ (TF), 
defined by 


$$\mathrm{TF} = \frac{N^{\rm SR}_{\rm MC}}{N^{\rm CR}_{\rm MC}} \tag{13.18}$$


The predicted background for a given process²⁴ is then given by 


$$N_{\rm predicted} = \mathrm{TF} \times N^{\rm CR}_{\rm data} \tag{13.19}$$

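
The transfer-factor method amounts to a few lines of arithmetic. The minimal sketch below follows eqns 13.18 and 13.19 for a single SM process and a single control region; the event counts in the usage example are hypothetical.

```python
def transfer_factor(n_mc_signal_region, n_mc_control_region):
    """Ratio of simulated yields in the signal and control regions (eqn 13.18)."""
    if n_mc_control_region <= 0:
        raise ValueError("the simulated control-region yield must be positive")
    return n_mc_signal_region / n_mc_control_region


def predicted_background(n_data_control_region, n_mc_signal_region, n_mc_control_region):
    """Predicted SM background in the signal region (eqn 13.19)."""
    return transfer_factor(n_mc_signal_region, n_mc_control_region) * n_data_control_region


if __name__ == "__main__":
    # Hypothetical yields: the simulation predicts 25 events in the SR and
    # 500 in the CR, while 480 events are observed in the CR in data.
    print(predicted_background(480, 25.0, 500.0))
```

Only the ratio of simulated yields enters, so an overall mismodelling of the normalization of the process cancels in the prediction.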
While this approach works well for SM processes involving W, Z, or 
top-quark production, it cannot be used to predict QCD-induced backgrounds 
such as jets that are misidentified as electrons or ‘fake’ $E_T^{\rm miss}$ 
arising from the finite detector resolution. In general, the QCD cross sec- 
tions are so much larger than those for BSM physics that a data-driven 
method must be used. There are several ways of doing this. One option 
is to perform a ‘template’ fit. For example, to estimate the background 
from ‘fake’ electrons (i.e. QCD jets wrongly identified as electrons): 


• A pure QCD sample is defined by reversing electron selection criteria, 
and this is used to measure the shape of the background 
distribution. 


• The Monte Carlo simulations are used to predict the shape of the 
sum of the other SM processes, excluding QCD multijets. 


This method works if the shapes of the two distributions are sufficiently 
different, as indicated in the sketch in Fig. 13.18. The data distribution 
is then fitted to the sum of the two distributions, and the fit can then 
be used to determine the number of background events in any SR. 
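
A template fit of this kind can be sketched with a simple linear least-squares fit of the data histogram to the sum of two fixed shapes. This is a simplified stand-in: real analyses typically use a binned likelihood fit with statistical and systematic uncertainties, and the template shapes and data below are invented for illustration.

```python
def fit_two_templates(data, template_a, template_b):
    """Least-squares normalizations (a, b) minimizing sum_i (data_i - a*A_i - b*B_i)^2."""
    if not (len(data) == len(template_a) == len(template_b)):
        raise ValueError("histograms must have the same number of bins")
    s_aa = sum(a * a for a in template_a)
    s_bb = sum(b * b for b in template_b)
    s_ab = sum(a * b for a, b in zip(template_a, template_b))
    s_da = sum(d * a for d, a in zip(data, template_a))
    s_db = sum(d * b for d, b in zip(data, template_b))
    determinant = s_aa * s_bb - s_ab**2
    if determinant == 0:
        raise ValueError("templates have identical shapes; the fit is degenerate")
    norm_a = (s_da * s_bb - s_db * s_ab) / determinant
    norm_b = (s_db * s_aa - s_da * s_ab) / determinant
    return norm_a, norm_b


if __name__ == "__main__":
    # Unit-normalized shapes: QCD falls steeply, the other SM processes rise.
    qcd_shape = [0.50, 0.30, 0.15, 0.05]
    other_sm_shape = [0.10, 0.20, 0.30, 0.40]
    # Invented data built from 200 QCD events and 100 other SM events.
    data = [110.0, 80.0, 60.0, 50.0]
    n_qcd, n_other = fit_two_templates(data, qcd_shape, other_sm_shape)
    print(n_qcd, n_other)
```

The degenerate case flagged in the code corresponds to the requirement in the text: if the two shapes are too similar, the fit cannot separate the two contributions.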

As discussed in Section 13.4.2, the number of free parameters in the 
minimal SUSY model (MSSM) is so large that it is not feasible to search 
the full parameter space. One example of a search [36] for squarks was 
performed with the assumption²⁵ that the gluinos are much heavier than 
the squarks and that the squarks decay with 100% branching ratio as 
$\tilde{q} \to q\tilde{\chi}^0_1$. The signature for the strong-interaction production of pairs of 
squarks is therefore (at least) two high-$p_T$ jets plus large $E_T^{\rm miss}$. Another 
SUSY search targets very large jet multiplicities that can arise from cascade 
decays of SUSY particles. The dominant background source for this 
analysis was from mismeasured values of $E_T^{\rm miss}$. The experimental uncertainty 
on the measurement of $E_T^{\rm miss}$ is found to scale as $\sqrt{H_T}$, 
where $H_T$ is the sum of the transverse momenta of the jets. Therefore, 
the signal (control) region is defined by events with large (small) values 
of $E_T^{\rm miss}/\sqrt{H_T}$. The distribution of $E_T^{\rm miss}/\sqrt{H_T}$ was measured for a 
lower-jet-multiplicity sample and this was then used to scale the number 
of background events in the control region to the signal region. The resulting 
distribution [31] of $E_T^{\rm miss}/\sqrt{H_T}$ is shown in Fig. 13.19. The data 
are in good agreement with the background calculations, and powerful 
limits on the masses of the squarks and gluinos can be obtained as 
shown in Fig. 13.20. The curves show that for a particular simplified 
SUSY model, a clear excess would have been expected at large values of 
$E_T^{\rm miss}/\sqrt{H_T}$. 

The hierarchy argument suggests that the mass of the stop squark 
should be relatively low (see eqn 13.16) but the masses of the other 
²⁵ Other analyses make different model 
assumptions. 


Fig. 13.19 Distribution of $E_T^{\rm miss}/\sqrt{H_T}$ 
measured by ATLAS in a signal region 
with the number of jets >10. The SM 
backgrounds and the expected signal 
for a simplified SUSY model 
with masses given by $m(\tilde{g}) = 900$ GeV 
and $m(\tilde{\chi}^0_1) = 150$ GeV are shown [31]. 


Fig. 13.20 Limits in the plane of the 
mass of the $\tilde{\chi}^0_1$ versus the mass of 
the lightest $\tilde{q}$, obtained in simplified 
models with the assumption of one non-degenerate 
squark or eight degenerate 
squarks [31]. The areas below the solid 
curves are excluded at 95% confidence 
level. 


Fig. 13.21 Kinematically allowed decay 
modes of the $\tilde{t}$ for different regions 
in the mass plane $(m_{\tilde{t}}, m_{\tilde{\chi}^0_1})$ [34]. 
squarks could be much higher without violating ‘naturalness’. Therefore, 
the search for stop squarks is particularly important. In general, the 
SUSY signal for $\tilde{t}\bar{\tilde{t}}$ production will suffer from SM backgrounds from $t\bar{t}$ 
production. The kinematically allowed decay modes for the stop depend 
on the masses of the stop and the neutralino, as illustrated in Fig. 13.21. 
Several searches for the stop have been performed. One particularly 
powerful search used final states with one lepton and multiple jets [34]. The 
analysis used several signal regions to target the kinematics for the different 
regions in the mass plane $(m_{\tilde{t}}, m_{\tilde{\chi}^0_1})$. The results were consistent 
with the SM, and the resulting limits are shown in Fig. 13.22. 

In some SUSY models, the squarks and gluinos are very heavy and 
the lightest SUSY particles are the charginos and neutralinos. Therefore, 
another interesting way of searching for SUSY is to select events with 
charged leptons and large values of $E_T^{\rm miss}$. There are many possible SUSY 
production and decay chains that would lead to these final states. One 
analysis [35] considered the electroweak production of $\tilde{\chi}^+_1\tilde{\chi}^-_1$, followed 
by either the decay chain $\tilde{\chi}^+_1 \to \nu\tilde{l}$ and $\tilde{l} \to l\tilde{\chi}^0_1$ or $\tilde{\chi}^+_1 \to l\tilde{\nu}$ and 
$\tilde{\nu} \to \nu\tilde{\chi}^0_1$, together with similar decay modes for the $\tilde{\chi}^-_1$ (see Fig. 13.23). 
The results shown in Fig. 13.24 represent a large improvement on the 
limits from LEP2.²⁶ Many other electroweak processes can be considered 
in the search for charginos and neutralinos. For example, the Feynman 
diagram for the electroweak production of $\tilde{\chi}^0_1\tilde{\chi}^\pm_1$ is shown in Fig. 13.25. 
In many models, the decays of the $\tilde{\chi}^\pm_1$ result in a charged lepton. This 
results in a distinctive experimental signature of a charged lepton and a 
large value of $E_T^{\rm miss}$. 


13.5.4 Summary of searches for new physics 


The limits on the masses of SUSY particles and other exotic physics 
have been greatly extended by the LHC, but as of 2015 there is no 
clear evidence for any new physics at the LHC. However, the mass reach 
will be greatly extended by the increase in energy to 13 TeV and the 
planned increase in luminosity up to and beyond the nominal value of 
$10^{34}$ cm$^{-2}$ s$^{-1}$, so there are prospects for exciting discoveries in the next 
few years. 


13.6 Linear collider 


The observation at LHC of a low-mass Higgs-boson-like particle gives a 
strong motivation for studying its properties in more detail in an $e^+e^-$ 
Fig. 13.22 Limits in the plane of the 
mass of the $\tilde{\chi}^0_1$ versus the mass of the 
$\tilde{t}$ [34]. The areas below the solid curves 
are excluded at 95% confidence level. 


²⁶ At LEP2, the limits on charged-particle 
production are to a good approximation 
given by the beam energy, 
since charged particles are produced in 
pairs. 


Fig. 13.23 Electroweak SUSY 
production processes for $\tilde{\chi}^+_1\tilde{\chi}^-_1$, with 
decays via sleptons or sneutrinos [35]. 


Fig. 13.24 Mass limits in the plane 
of the masses of the $\tilde{\chi}^0_1$ and $\tilde{\chi}^\pm_1$ in a 
simplified SUSY model [35]. 


Fig. 13.25 A Feynman diagram for 
electroweak chargino–neutralino 
production. 


Fig. 13.26 Production of Higgs 
bosons in $e^+e^-$ annihilation. 
collider. In an $e^+e^-$ collider, a Higgs boson would be produced by the 
‘Higgsstrahlung’ process (see Fig. 13.26) $e^+e^- \to HZ$. The minimum 
CMS energy for this process is $m_Z + m_H$, but a useful rule of thumb 
is that to get a sufficiently large cross section requires an energy of 
$\sim m_H + 100$ GeV. Given the mass of the boson observed at LHC (see 
Chapter 12), the minimum energy of an $e^+e^-$ collider to be operated 
as a ‘Higgs factory’ would be ~230 GeV. In such a process, the events 
could be tagged by detecting the Z decay products, so that all decay 
modes of the Higgs boson could be studied precisely to determine their 
branching ratios. In the SM, all the branching ratios of the Higgs boson 
can be calculated once its mass has been determined, and therefore this 
would provide stringent tests of whether the properties of the observed 
particle are consistent with those expected in the SM. Using polarized 
electron beams would also allow a determination of the spin and parity 
of the Higgs boson (the LHC measurements strongly support the SM 
assignment of a spin/parity of $0^+$). 
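
The energies quoted above, and the way in which the Z tag identifies Higgs events, can be checked with two-body kinematics. The sketch below assumes approximate values for the masses; the recoil-mass formula, $m_{\rm recoil}^2 = s + m_Z^2 - 2\sqrt{s}\,E_Z$, is the standard consequence of energy–momentum conservation and is our addition for illustration rather than something derived in the text.

```python
import math

M_Z = 91.19   # GeV, Z boson mass (approximate)
M_H = 125.0   # GeV, Higgs boson mass (approximate)


def threshold_energy():
    """Minimum CMS energy for e+ e- -> HZ, in GeV."""
    return M_Z + M_H


def rule_of_thumb_energy():
    """CMS energy giving a usefully large cross section (m_H + 100 GeV)."""
    return M_H + 100.0


def z_energy(sqrt_s):
    """Energy of the Z in the CMS for e+ e- -> HZ at CMS energy sqrt_s (GeV)."""
    return (sqrt_s**2 + M_Z**2 - M_H**2) / (2.0 * sqrt_s)


def recoil_mass(sqrt_s, e_z, m_z=M_Z):
    """Mass recoiling against a Z of energy e_z and reconstructed mass m_z."""
    return math.sqrt(sqrt_s**2 + m_z**2 - 2.0 * sqrt_s * e_z)


if __name__ == "__main__":
    print(threshold_energy(), rule_of_thumb_energy())
    print(recoil_mass(250.0, z_energy(250.0)))
```

Because the recoil mass uses only the Z decay products and the known beam energy, the Higgs boson can be identified without reconstructing its own decay, which is what allows all decay modes to be studied.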

However, to confirm that the particle really was a Higgs boson, we 
would need to confirm that the self-coupling of the Higgs boson exists 
with the expected strength given by the SM. This could be studied 
with higher luminosity at the LHC or at an $e^+e^-$ collider if it had 
sufficiently high CMS energy. If the CMS energy of a linear collider were 
larger than twice the top mass, many precise measurements would be 
possible. The mass of the top quark could be measured with a precision 
an order of magnitude better than achievable at LHC. By comparing 


the measurements of the mass of the W, the top quark, and the Higgs 
boson, a precision test of the radiative corrections in the SM would be 
possible. The interest here is that any deviations from the SM prediction 
would give indications of new physics at higher mass scales, through 
loop diagrams. Precision measurements of the top-quark axial and vector 
couplings could be made using the electron polarization and again any 
deviations from the SM predictions would give sensitivity to new physics. 

If the CMS energy were large enough, an $e^+e^-$ collider could be used 
to study SUSY particles. This would require that the beam energy be 
larger than the mass of the lightest charged SUSY particle. This might require a 
much higher-energy machine than would be required for the study of 
the Higgs boson and top quark. Such a machine would have to be a 
linear collider, because a circular machine would have unacceptable syn- 
chrotron radiation losses or require an unrealistically large radius. R&D 
for a linear collider is based around the development of high-gradient 
superconducting radiofrequency (RF) cavities. Currently, gradients of 
~35 MeV m$^{-1}$ have been achieved. In order to achieve useful interaction 
rates, very high luminosity would be required. As a linear collider is 
a single-pass machine (unlike circular machines), this will require very 
small beam sizes. This makes many demands on the machine: 


• high-quality, low-emittance beams of both electrons and positrons; 


• powerful final focus to reduce the beam sizes in the vertical 
dimension to ~1 nm; 


• precise alignment of the electron and positron beams and fast-feedback 
systems to ensure that the beams collide. 


Considerable progress has been made in these issues, but more remains 
to be done before such a machine could be built. 

Looking further into the future, if a higher-energy $e^+e^-$ collider were 
to be built, it would probably require the more exotic technology being 
developed for the CLIC (Compact Linear Collider). This is aimed at a 
collider with a CMS energy around 3 TeV. To keep the length affordable, 
accelerating gradients of ~100 MeV m$^{-1}$ would be required, which is too 
large for the superconducting RF technology (see Chapter 3) being developed 
for lower-energy linear colliders. Such gradients could be achieved 
using a two-beam-acceleration concept in which the RF power is generated 
by a very high-current ($I \sim 100$ A) electron beam (drive beam) 
running parallel to the main beam. This drive beam is decelerated and 
the generated RF power is transferred to the main beam. 
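
The link between accelerating gradient and machine length is a one-line estimate. The sketch below assumes that the full CMS energy is supplied by the two linacs, that the accelerating structures fill the whole length, and that the injection energy is negligible; a real design would be longer.

```python
def active_linac_length_km(cms_energy_gev, gradient_mev_per_m):
    """Total accelerating length (both linacs) in km for a given gradient.

    Assumes singly charged particles, so a gradient of 1 MeV per metre
    gives an energy gain of 1 MeV for each metre of structure.
    """
    if gradient_mev_per_m <= 0:
        raise ValueError("the gradient must be positive")
    length_m = cms_energy_gev * 1000.0 / gradient_mev_per_m
    return length_m / 1000.0


if __name__ == "__main__":
    for gradient in (35.0, 100.0):
        length = active_linac_length_km(3000.0, gradient)
        print(f"{gradient:5.0f} MeV/m -> {length:.1f} km of accelerating structure")
```

For a 3 TeV collider, this gives roughly 86 km at 35 MeV m$^{-1}$ compared with 30 km at 100 MeV m$^{-1}$, which illustrates why the higher gradient is needed to keep the length affordable.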


13.7 Dark matter 


There are strong indications from astrophysics that the universe contains 
a large amount of non-baryonic matter. This matter only has weak and 
gravitational interactions and cannot emit electromagnetic radiation,²⁷ 
hence the name ‘dark matter’. The evidence for this dark matter comes 
²⁷ There are models in which dark matter 
only has gravitational interactions, 
but they are not discussed here as they 
are almost impossible to test directly 
experimentally. 


Fig. 13.27 General picture of 
quark–WIMP scattering. The WIMPs 
are labelled as $\chi^0$. 


²⁸ A classic example of an EFT is 
the Fermi theory of weak interactions, 
which ignores the effects of the W 
propagator and treats the interaction as 
a four-particle interaction. The Fermi 
theory works very well at low energies 
(~10 MeV) because of the very high 
mass of the W boson. 


²⁹ An alternative attempt to explain 
this and other effects is to assume that 
the inverse square law needs to be 
modified at large values of r. 


from its gravitational interaction with the luminous normal matter and 
this is briefly reviewed in Section 13.7.1. If such dark matter exists in the 
form of weakly interacting massive particles (WIMPs), there are three 
different approaches to studying them in laboratory or astroparticle ex- 
periments, which can be explained in terms of the cartoon shown in 
Fig. 13.27. 


(1) Direct detection: In this approach, we use the interactions of a 
WIMP with a quark in a nucleus and try to detect the resulting 
nuclear recoil. This is discussed in Section 13.7.2. 


(2) Accelerator detection: If we collide quarks (or other partons) at 
sufficiently high energy, we can pair-produce WIMPs. As WIMPs 
will not interact in the detector, this process is observable in events 
in which one of the initial-state quarks radiates a gluon or a W 
or Z, since this then results in events with large values of $E_T^{\rm miss}$. 
These events can be searched for in the same way as SUSY (see 
Section 13.5.3). 


(3) WIMP annihilation: If WIMPs accumulate under gravity, there 
will be significant rates of annihilation. The search for these 
reactions is discussed in Section 13.7.3. 


This qualitative comparison between the accelerator searches and direct 
detection can be made more quantitative using effective field theories 
(EFTs) to describe the interactions between quarks and WIMPs. An 
EFT provides a low-energy approximation to a fuller but not yet known 
theory. The full theory would give accurate predictions at high energies, 
whereas the EFT would be expected to fail at high energies. If the separ- 
ation between the two energy scales is large enough, the EFT may give 
reliable predictions at low energy.²⁸ 
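
The range of validity of an EFT can be illustrated with the Fermi theory, in which the W propagator factor $1/(M_W^2 - q^2)$ is replaced by the constant $1/M_W^2$. The sketch below computes the ratio of the two for a momentum transfer q well below the W mass; the value of $M_W$ is approximate, and the treatment (a single real momentum transfer, width neglected) is a deliberate simplification.

```python
M_W = 80.38  # GeV, W boson mass (approximate)


def propagator_ratio(q_gev):
    """Ratio of the full propagator factor 1/(M_W^2 - q^2) to the contact value 1/M_W^2.

    Valid only for q below M_W; the W width is neglected.
    """
    if abs(q_gev) >= M_W:
        raise ValueError("this simple expression is only meaningful for |q| < M_W")
    return 1.0 / (1.0 - q_gev**2 / M_W**2)


if __name__ == "__main__":
    for q in (0.01, 1.0, 10.0, 40.0):
        print(f"q = {q:6.2f} GeV -> full / contact = {propagator_ratio(q):.8f}")
```

At momentum transfers of ~10 MeV, the contact interaction differs from the full propagator by only about one part in $10^8$, whereas at tens of GeV the difference is tens of per cent and the EFT is no longer reliable.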


13.7.1 Astrophysical evidence for dark matter 


There are several aspects of the astrophysical evidence for dark matter. 
One approach uses galactic rotation curves. The velocities of stars can be 
measured from their Doppler shifts and these can be compared with the 
velocities calculated assuming Newtonian gravitation. The latter calcu- 
lation assumes that the distribution of gravitational matter follows that 
of the luminous matter. In this case, using Newtonian mechanics and the
inverse square law of gravity,$^{29}$ one expects to find the rotation velocity
$v(r)$ scaling with distance $r$ from the galactic centre as $v(r) \propto 1/\sqrt{r}$ for
values of $r$ larger than the bulk of the luminous matter of the galaxy.
However, the rotation curves are usually much flatter at large $r$ (see
Fig. 13.28 for an example of a galactic rotation curve [47]). If one assumes
that the inverse square law of gravity is correct, then there must
be a halo of non-luminous matter, called ‘dark matter’. To produce a
flat rotation curve, the dark matter mass contained within a radius $r$
must scale like $M(r) \propto r$ or the density must scale like $\rho(r) \propto 1/r^2$ (at
very large values of $r$, the density must fall off faster in order for the
galactic mass to be finite).
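
The scaling argument above can be checked numerically. The following is a minimal sketch, not taken from the text: it assumes a point-like luminous mass plus a hypothetical halo whose enclosed mass grows linearly with radius. The function names and the illustrative masses ($5 \times 10^{10}$ solar masses of luminous matter, $10^{10}$ solar masses of halo per kpc) are assumptions chosen only to show the trend.

```python
import math

# Newton's constant in astronomer's units: kpc (km/s)^2 per solar mass.
G = 4.30091e-6


def circular_velocity(m_enclosed, r):
    """Circular speed in km/s for enclosed mass (solar masses) at radius r (kpc)."""
    return math.sqrt(G * m_enclosed / r)


def v_luminous_only(r, m_lum=5e10):
    """All luminous mass lies inside r, so v falls as 1/sqrt(r)."""
    return circular_velocity(m_lum, r)


def v_with_halo(r, m_lum=5e10, halo_mass_per_kpc=1e10):
    """Adds an assumed halo with M_halo(r) = halo_mass_per_kpc * r."""
    return circular_velocity(m_lum + halo_mass_per_kpc * r, r)


if __name__ == "__main__":
    for r in (10, 20, 40, 80):
        print(f"r = {r:3d} kpc   luminous only: {v_luminous_only(r):6.1f} km/s"
              f"   with halo: {v_with_halo(r):6.1f} km/s")
```

With these assumed numbers the luminous-only curve halves every time $r$ quadruples, while the curve with the halo flattens towards the constant value $\sqrt{G \times 10^{10}}$ km s$^{-1}$.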

Additional evidence comes from the motion of galaxies in clusters of 
galaxies. A striking piece of evidence for dark matter comes from the 
observation of the Bullet Cluster, in which two clusters have collided 
with one another. The distribution of gravitational mass can be inferred 
from gravitational lensing (see Perkins in Further Reading) and is found 
to be very different to the distribution of baryonic mass as determined 
from the observed light. The baryonic matter interacted via electromag- 
netic interactions and shows evidence for shock waves. Therefore, the 
interactions between the dark matter and the baryonic matter must be 
very much weaker than this. 

From global fits to these and other data, the dark matter content of the 
universe is much greater than that of ordinary baryonic matter. Direct 
dark matter searches are discussed in Section 13.7.2 and the search for 
dark matter annihilation is discussed in Section 13.7.3. The accelerator 
(LHC) search is not discussed further here, since the methodology is 
similar to that of the search for events with large values of $E_T^{\mathrm{miss}}$ that
has been discussed in the context of SUSY. 


13.7.2 Direct dark matter detection 


At first sight, it might seem surprising that it is difficult to detect dark 
matter if there is much more dark matter than ordinary matter. The 
problem is that the dark matter particles are expected to be ‘cold’, with 
a typical speed of $\sim$100 km s$^{-1}$, and they only interact weakly. For typical
ranges of WIMP mass, 10 GeV to 10 TeV, the WIMP-nucleon interaction
will be by elastic scattering. The signature of such an interaction would
be nuclear recoil with energy in the range 1-100 keV. This energy is
very low and requires specialized detection techniques. The cross sec- 
tions are very small and so the event rates will be low. This presents an 
enormous experimental challenge to detect a signal above background. 

Fig. 13.28 Rotation curve for the galaxy NGC 6503 [47].

It is essential to reduce the rate of cosmic-ray interactions, which is
usually done by performing the experiments in deep underground caverns.
The next background to consider is that from natural radioactivity.
This can be reduced, but not eliminated, by very careful control of all the
materials used in the detectors. It is therefore advantageous if the detector
provides powerful separation between the nuclear recoil signals and
radioactive backgrounds. The detectors also have to be large to achieve good
sensitivity. There is no perfect solution to these challenges, and many
different approaches are currently being pursued. The following
are some of the techniques used:


• Scintillation detectors: For example, the DAMA detector uses
sodium iodide crystals as scintillators. Some separation between 
the signal from nuclear recoils and the backgrounds can be ob- 
tained from the measurement of the signal rise time. If there is a 
dark matter signal superimposed on a background, there should be 
an annual modulation because of the velocity of the Earth relative 
to the dark matter wind. The DAMA Collaboration have claimed 
to see such an effect, but the claim is controversial and has not 
been confirmed. 


• Semiconductor detectors: These include germanium detectors.
Again, the rise times can be used to provide some discrimination 
between nuclear recoils and radioactive backgrounds. Very low 
thresholds are required to see the small signals. If the detectors are 
cooled to very low temperatures, they can also be used as phonon 
detectors (see below) and the combination of the ionization and 
phonon signals provides powerful discrimination between nuclear 
recoils and backgrounds. 


• Phonon detectors: Cryogenic detectors can be used to detect the
nuclear recoil energy that is converted to phonons. One technique 
is to maintain a superconductor at the superconducting transition 
temperature so that a very small phonon signal will cause it to 
have a large and detectable change in resistance. If, in addition, 
the crystals are scintillating at cryogenic temperatures, the scin- 
tillation light can be read out using photomultipliers. This is very 
useful, because the correlation between the phonon and scintillation
signals is very different for nuclear recoils and the backgrounds
from β and γ radioactive sources. The CRESST experiment uses
this technique with calcium tungstate crystals as the scintillator. 


• Noble liquids and gases: Noble liquids such as xenon can be
used to detect scintillation light. Dual noble liquid and gas de- 
tectors have also been used. A large electric field is used to drift 
the ionization electrons towards the gas in a wire detector. This 
allows the simultaneous measurement of the scintillation and ionization
yield. Nuclear recoils from WIMPs are moving slowly and
so tend to produce small direct ionization signals (S2), but they do
produce scintillation light (S1) efficiently. The backgrounds from β
and γ rays will also produce scintillation light, but they will also
produce larger ionization yields. Therefore, the ratio S2/S1 provides
powerful discrimination between nuclear recoils (signal) and
radioactive backgrounds. A diagram of the LUX detector [14] is
shown in Fig. 13.29.


Although there are several claims to have seen WIMP signals, they are 
all controversial and there are no signals confirmed by two experiments. 
Larger detectors are planned with target masses of the order of a ton, 
and these should be sensitive to WIMP signals over the range of WIMP 
masses and cross sections expected in common SUSY models that are 
compatible with the astrophysical data (see Section 13.7.1). 
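
The recoil-energy range quoted at the start of this section can be checked with the non-relativistic expression for the maximum recoil energy given in Exercise 13.5. The following is a rough sketch under stated assumptions: the nuclear mass is approximated as the mass number times one atomic mass unit, the WIMP speed is taken as 220 km s$^{-1}$, and the function and constant names are illustrative rather than anything defined in the text.

```python
C_KM_S = 299_792.458   # speed of light in km/s
AMU_GEV = 0.931494     # one atomic mass unit in GeV/c^2


def max_recoil_energy_kev(m_wimp_gev, m_nucleus_gev, v_km_s):
    """Maximum nuclear recoil energy (keV) for a head-on elastic collision.

    Uses E_max = 2 m_N p^2 / (m_N + m_WIMP)^2 with p = m_WIMP * v,
    valid for a non-relativistic WIMP hitting a nucleus at rest.
    """
    p = m_wimp_gev * v_km_s / C_KM_S          # momentum in GeV/c
    e_gev = 2.0 * m_nucleus_gev * p ** 2 / (m_nucleus_gev + m_wimp_gev) ** 2
    return e_gev * 1.0e6                      # GeV -> keV


def nucleus_mass_gev(mass_number):
    """Approximate nuclear mass, ignoring binding-energy corrections."""
    return mass_number * AMU_GEV


if __name__ == "__main__":
    m_xe = nucleus_mass_gev(131)
    for m_wimp in (1.0, 10.0, 100.0, 1000.0):
        e = max_recoil_energy_kev(m_wimp, m_xe, 220.0)
        print(f"m_WIMP = {m_wimp:7.1f} GeV   E_max = {e:10.4f} keV")
```

For a xenon target this gives a few tens of keV for a 100 GeV WIMP but only of order 10 eV for a 1 GeV WIMP, which illustrates why low thresholds matter and why low-mass WIMPs are hard to detect with heavy targets.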


13.7.3 Dark matter annihilation 


If dark matter accumulates under gravity in the centres of massive objects
like the Sun, it should be possible to detect some of the products of
the resulting annihilation reactions. One approach uses the muon neutrinos
resulting from cascade decays. Neutrino ‘telescopes’ should then
be able to detect the neutrinos and show that they come from the centre
of the Sun. One such neutrino telescope is IceCube [90], which has
strings of photomultipliers buried in the Antarctic ice, giving it a volume
of about 1 km$^3$. Fast muons produced in the neutrino interactions emit
Cherenkov light, which is detected by the photomultipliers.$^{30}$ There is
a background from downward-going muons, but upward-going muons can only
come from neutrinos that have travelled through the Earth. Satellite
experiments like Fermi, PAMELA, and AMS are searching for the antiparticles
that are expected from WIMP annihilation in the galactic halo. The measured
positron-to-electron ratio is increasing with energy as would be expected in dark
matter models, but the data are not yet able to exclude conventional
sources such as pulsars.


13.8 Dark energy 


Astronomers have been trying to measure the expansion of the Universe 
to determine if it is ‘closed’ (i.e. the expansion will be reversed at some 
future time) or ‘open’ (i.e. it will continue to expand indefinitely). This 
requires a determination of the rate of expansion with distance scale at 
the largest distances. The Universe is known to be expanding (called 
the ‘Hubble expansion’ after the astronomer who discovered this effect). 
The recession velocity of distant objects can be measured by looking at 
the redshifts of spectral lines compared with laboratory measurements of 
the rest values. The redshift is defined in terms of the shift in wavelength
by $z = \Delta\lambda/\lambda$. The recession velocity $\beta$ can be determined from the
Doppler formula


(13.20) 
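
As a hedged illustration of how a measured redshift is turned into a recession velocity, the sketch below assumes that the Doppler formula takes the standard relativistic form for purely radial motion, $1 + z = \sqrt{(1+\beta)/(1-\beta)}$, with $\beta$ the velocity in units of $c$. That form and the function names are assumptions of this example.

```python
import math


def redshift_from_beta(beta):
    """Redshift z for a source receding radially at speed beta (units of c)."""
    return math.sqrt((1.0 + beta) / (1.0 - beta)) - 1.0


def beta_from_redshift(z):
    """Invert 1 + z = sqrt((1 + beta) / (1 - beta)) for beta."""
    k = (1.0 + z) ** 2
    return (k - 1.0) / (k + 1.0)


if __name__ == "__main__":
    for z in (0.001, 0.01, 0.1, 1.0):
        print(f"z = {z:6.3f}   beta = {beta_from_redshift(z):.5f}")
```

For small redshifts the result reduces to $\beta \approx z$, which is the regime in which the simple Doppler picture is appropriate.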
Fig. 13.29 LUX detector showing the
arrays of photomultiplier tubes and
the electrodes to create the required
electric fields [14].


30 The cleanliness of the Antarctic ice
results in a remarkably long attenuation
length for the light of 55 m at a
wavelength of 470 nm [89].

31 This is referred to in astronomy as
determining the intrinsic luminosity of
‘standard candles’. For example, the
periods of Cepheid variable stars are
correlated with their intrinsic luminosities.
As the distances of nearby Cepheid
variables can be determined geometrically
by parallax measurements,
this allows an absolute calibration.


32 Type Ia supernovae are believed to
arise when a white dwarf star accretes
enough mass from its main sequence
(see Perkins in Further Reading if you
are not familiar with this concept)
companion that it exceeds the critical
Chandrasekhar mass. The resulting nuclear
fusion creates nickel and iron. The
radioactive nickel atoms decay. It is assumed
that more massive stars have
to expand for longer before the opacity
decreases enough for the photons
to escape.


This picture is appropriate for nearby galaxies, but at larger distances 
Newtonian gravity is no longer correct. The Hubble law can be expressed 
as a linear relation between redshift and distance. The distances (called
the luminosity distances $D_L$) are estimated by comparing the measured
fluxes $F$ and intrinsic luminosities $L$ of specific objects for which $L$ can
be determined, using the inverse square law

$$F = \frac{L}{4\pi D_L^2} \qquad (13.21)$$

The main difficulty in this approach is the determination of $L$.$^{31}$ For the
largest distances the best method has used type Ia supernovae. Type 
Ia supernovae are bright enough to be detected at very large redshifts 
(and hence distances). By measuring the light output over a period of 
a few weeks, the width of the light curve can be measured. Empirically, 
for nearby supernovae for which the distances can be estimated with 
other techniques, there is a very good correlation between the width of 
the light curve and the intrinsic luminosity.$^{32}$ This then allows type Ia
supernovae to be used as standard candles. However, the more distant,
and therefore older, supernovae will have different chemical compositions
from the nearer, younger ones, since they would have been formed from
interstellar material that had not been through a cycle of nuclear fusion
in stars. This effect might bias the light-curve calibration used to
determine the absolute luminosity. The resulting Hubble plot at large redshifts
is shown in Fig. 13.30 and has clear deviations from linearity. The very 
surprising conclusion is that the expansion of the universe appears to be 
accelerating. More precise but less direct evidence for this acceleration 
can be determined from measurements of the anisotropy of the Cosmic 
Microwave Background (CMB) radiation. 
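
The inverse square law of eqn 13.21 can be turned into a one-line distance estimate. The sketch below is illustrative only: it ignores the cosmological corrections that matter at large redshift, and it uses the Sun as a sanity check with commonly quoted nominal values (luminosity $3.828 \times 10^{26}$ W, flux at the Earth 1361 W m$^{-2}$), which are assumptions of this example.

```python
import math


def flux_from_luminosity(luminosity_w, distance_m):
    """Inverse square law: F = L / (4 pi D^2), in W/m^2."""
    return luminosity_w / (4.0 * math.pi * distance_m ** 2)


def luminosity_distance(luminosity_w, flux_w_m2):
    """Solve F = L / (4 pi D_L^2) for D_L, in metres."""
    return math.sqrt(luminosity_w / (4.0 * math.pi * flux_w_m2))


if __name__ == "__main__":
    # Sanity check with the Sun: the result should be close to 1 AU.
    d_sun = luminosity_distance(3.828e26, 1361.0)
    print(f"Distance inferred for the Sun: {d_sun:.4e} m")
```

The same relation shows why the calibration of $L$ dominates the uncertainty: a fractional error in $L$ feeds into $D_L$ with half its size, since $D_L \propto \sqrt{L}$.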


13.8.1 Theoretical implications 


One approach to understanding the accelerating expansion is to introduce
a cosmological constant $\Lambda$ into the Friedmann equation (see Perkins
in Further Reading) for the expansion of the universe. The measured
value corresponds to an energy density of approximately 5 GeV m$^{-3}$. In quantum
field theory, we expect the vacuum fluctuations to contribute to the energy
density. The integral for the energy density is divergent and if we
assume there is a cut-off at the Planck scale, $M_{\mathrm{Planck}}$, we would expect
(see Exercise 13.10) that $\Lambda \sim 10^{121}$ GeV m$^{-3}$. New physics such as SUSY
at the TeV scale would provide a much lower cut-off, but this would still
disagree with the measured value by a factor of $10^{15}$. One possible solution
would be to assume that we need to modify Einstein’s theory of
general relativity. Another option is that the acceleration does not arise
from a cosmological constant but from a new field called ‘quintessence’.
A more radical alternative is based on the Anthropic Principle, according
to which, in the context of inflationary models of the early universe,
our universe might be just one universe in a larger multiverse and the
observed value of $\Lambda$ might be a selection bias: only a universe with a
sufficiently small value of $\Lambda$ would be old enough to allow structures and
life to evolve.
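
The size of the mismatch can be estimated with a few lines of arithmetic. The sketch below is a rough estimate under stated assumptions: it uses the density of states quoted in Exercise 13.10, assigns a zero-point energy of $k/2$ to each mode of a single massless field, and cuts the integral off at $E_{\max}$, which gives $\rho = E_{\max}^4 / (16\pi^2 (\hbar c)^3)$. The function and constant names are illustrative.

```python
import math

HBAR_C_GEV_M = 1.973269804e-16    # hbar * c in GeV m
RHO_MEASURED_GEV_M3 = 5.0         # approximate measured value quoted in the text


def vacuum_energy_density(e_max_gev):
    """Zero-point energy density in GeV/m^3 for a cut-off at e_max_gev.

    Integrates (k/2) * 4 pi k^2 dk / (2 pi)^3 from 0 to e_max in natural
    units, then converts GeV^4 to GeV/m^3 by dividing by (hbar c)^3.
    """
    rho_natural = e_max_gev ** 4 / (16.0 * math.pi ** 2)   # GeV^4
    return rho_natural / HBAR_C_GEV_M ** 3


if __name__ == "__main__":
    for label, e_max in (("Planck scale", 1.0e19), ("TeV scale", 1.0e3)):
        rho = vacuum_energy_density(e_max)
        ratio = rho / RHO_MEASURED_GEV_M3
        print(f"{label:12s}  rho ~ 10^{math.log10(rho):.0f} GeV/m^3"
              f"   rho / measured ~ 10^{math.log10(ratio):.0f}")
```

Note that this sketch compares energy densities, which scale as the fourth power of the cut-off; ratios of the corresponding energy scales are far smaller.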


Chapter summary 


• Many tests of the SM have already been performed at the LHC, giving
confidence in the SM in this new regime of high CMS energy and low
parton $x$.

• Many searches for BSM physics have so far proved negative, but a much
larger search window is being opened up now that the LHC CMS energy
has been raised to 13 TeV.

• A relatively low-energy $e^+e^-$ collider would be able to make complementary,
precision tests of the Higgs sector of the SM.

• There is astrophysical evidence for dark matter. This has led to a new
area of research with the aim of direct or indirect detection of WIMPs.

Fig. 13.30 (a) Hubble plot at moderate
redshifts. (b) Hubble plot at large
redshifts, with the linear trend seen at
low z values divided out. The curves are
fits to the Friedmann equation (see Perkins
in Further Reading) with different
parameters [115].

Further reading


• Langacker, P. (2010). The Standard Model and Beyond.
CRC Press. This has very thorough and advanced
discussions of the SM and some BSM theories.

• Martin, S. P. (1997). A Supersymmetry Primer.
arXiv:hep-ph/9709356. This is a good introduction to
the phenomenology and theory of SUSY.

• Aitchison, I. (2007). Supersymmetry in Particle Physics:
An Elementary Introduction. Cambridge University
Press. This gives a more advanced theoretical view
of SUSY.

• Perkins, D. (2009). Particle Astrophysics (2nd edn).
Oxford University Press. This is a very good introduction
to many areas of physics at the interface between
particle physics and astrophysics.


Exercises 


(13.1) By considering the combinatorics for one or two
independent electron triggers to fire in $Z \to e^+e^-$
events, verify the formula for the trigger efficiency
for the tag-and-probe analysis (eqn 13.1).


(13.2) Consider the process $pp \to W \to \ell\nu$. If the W
is produced with a mass $M_W$, determine an expression
for the parton momentum fractions $x_1$
and $x_2$ of the protons in terms of the CMS energy
$\sqrt{s}$. If the W is produced at rest, show that
$x_1 = x_2$. Hence evaluate $x_1$ for this condition for
the LHC operating at $\sqrt{s} = 7$ TeV.


(13.3) What are the main SM backgrounds for a search 
involving final states with three charged leptons 
(as in the SUSY $\tilde{\chi}_2^0 \tilde{\chi}_1^\pm$ production)? How could
the SM background be reduced so as to enhance 
the sensitivity of a SUSY search? How could 
the SM background be estimated with minimal 
reliance on Monte Carlo calculations? 


(13.4) If a y has the same electromagnetic couplings as 
a y, why does it have so much weaker interactions 
in matter? 


(13.5) Consider the elastic scattering of a (low-energy)
WIMP with mass $m_{\mathrm{WIMP}}$ and momentum $p$ with
a nucleus of mass $m_N$. Assuming that the WIMP
is non-relativistic (a very good approximation),
show that the maximum kinetic energy of the
recoil nucleus is given by

$$\frac{2 m_N p^2}{(m_N + m_{\mathrm{WIMP}})^2}$$

Discuss the implications for the choice of nuclear
target for optimal sensitivity for a given WIMP
mass. Evaluate the recoil energy for the case of
WIMPs moving with a speed $v \sim 220$ km s$^{-1}$,
for a WIMP mass $m_{\mathrm{WIMP}} = 1$ GeV, assuming
that the target nucleus is xenon (atomic mass
131). Repeat the calculation for a WIMP of mass
$m_{\mathrm{WIMP}} = 100$ GeV. Discuss the implications for
the direct detection of WIMPs and in particular
of low-mass WIMPs.


(13.6) Consider a dark matter search using a xenon
(atomic mass 131 AMU) detector with a target
mass of 10 tons. Assume that the local energy
density of WIMPs is $\rho \sim 0.4$ GeV cm$^{-3}$ and
that the WIMP speed (relative to the nuclei) is
$v = 220$ km s$^{-1}$. If no events are observed in 1
year of operation, then, explaining any approximations
you make, estimate a 90% confidence
level upper limit to the WIMP-xenon cross section
for a WIMP of mass 100 GeV. Why would a
more accurate calculation result in less stringent
limits?


(13.7) Draw a Feynman diagram for the production of
a W boson and a charm quark in pp interactions
and hence explain how the measurement of this
process can be used to determine the strange-quark
parton distribution function at a value
of $Q^2 = M_W^2$. Experimentally, it is found that
the ratio of parton distribution functions (see
eqn 9.56) $s(x, Q^2)/u_s(x)$ has a value of less than 1
at low values of $Q^2$ but is compatible with 1 at
$Q^2 = M_W^2$. If the charm quark produces a D
meson, how might this process be tagged in an
LHC experiment?


(13.8) Draw a Feynman diagram for the production of
prompt photons in pp interactions. Hence explain
how the measured rate of prompt photon production
at the LHC could be used to constrain the
gluon parton density function.


(13.9) Assuming that the angular distribution of jets
($dN/d\cos\theta^*$) in the parton-parton CMS is similar
to that in Rutherford scattering, show that
the distribution in $\chi$ (see eqn 13.17) will be
approximately isotropic. Show that a uniform distribution
in $\cos\theta^*$ will result in a peak in $\chi$ near
$\chi = 1$. Explain from an experimental perspective
the advantages to a search for new physics
that uses the variable $\chi$ as opposed to the jet
transverse momentum $p_T$.


(13.10) The density of states for a quantum oscillator
is $4\pi V k^2\,dk/(2\pi)^3$, where $V$ is the volume and
$k$ the momentum. Use this expression to evaluate
the vacuum energy density, assuming that
the divergent integral is cut off at some high
energy scale $E_{\max}$. Estimate this value assuming
that the cut-off is given by the Planck mass
$M_{\mathrm{Planck}} \sim 10^{19}$ GeV. Compare this value with the
critical energy density $\rho_c \sim 5$ GeV m$^{-3}$ and comment
on the significance of this comparison. How
would your conclusions change if we assumed that
SUSY provided a cut-off at $E_{\max} \sim 1$ TeV?